https://forums.mirc.com/ubbthreads.php/topics/225583/upper-and-lowercase
https://forums.mirc.com/ubbthreads.php/topics/214238/lower-upper

I've read these threads and I still don't understand the rules for what makes a character be considered uppercase vs lowercase when the character is in the higher unicode ranges. There are 5 color-coded groups in this alias where the combo of results for these 4 identifiers is either a bug, or there are additional rules which govern them.

Based on what is used for the normal 33-126 range, I'd assumed there would be a few simple rules.

* If a character could be returned by $upper, then $isupper(char) would be $true

* If a character could be returned by $lower, then $islower(char) would be $true

The above 2 rules mean that many characters like '123' would return $true for both $isupper and $islower

* If $upper(char) and $lower(char) were different from each other, then $isupper($upper(char)) must be $true and $isupper($lower(char)) must be $false. Also $islower($lower(char)) must be $true and $islower($upper(char)) must be $false.

* If $upper(char) and $lower(char) returned the same codepoint, then it shouldn't be possible for exactly 1 of $isupper(char) $islower(char) to be $true and the other be $false.
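To make the last two rules concrete, here's a minimal sketch of how they could be checked for a single codepoint. The alias name rulecheck is made up for this illustration and is not part of the test alias further down; it only uses $chr, $asc, $upper, $lower, $isupper and $islower.

```mirc
; hypothetical helper: $rulecheck(N) returns $true if codepoint N obeys the last two rules
alias rulecheck {
  var %c = $chr($1)
  var %up = $upper(%c)
  var %lo = $lower(%c)
  if ($asc(%up) != $asc(%lo)) {
    ; upper and lower forms differ: each form should match only its own test
    if ((!$isupper(%up)) || ($isupper(%lo))) return $false
    if ((!$islower(%lo)) || ($islower(%up))) return $false
    return $true
  }
  ; upper and lower forms are the same codepoint: $isupper and $islower should agree
  if ($isupper(%c) != $islower(%c)) return $false
  return $true
}
```

It can be tried from an editbox with something like //echo -a $rulecheck(65) $rulecheck(223) $rulecheck(304)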

However, when looking at the whole unicode range, there are many more exceptions to the above 'rules' than there are characters that comply with them.

This alias examines codepoints 1-65535 (skipping the surrogate range 55296-57343) and, for each one, builds a 5-digit flag string of 1's or 0's based on whether they return $true = 1 or $false = 0 for:

* $isupper(char)
* $islower(char)
* $isupper($upper(char))
* $islower($lower(char))
* $asc($upper(char)) == $asc($lower(char))
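For anyone who wants to look at just one codepoint rather than run the whole loop, here's a rough sketch that builds the same 5-digit flag string in the order listed above. The alias name caseflags is made up for this example; the full alias further down does the same thing inline with %b1 through %b5.

```mirc
; hypothetical helper: $caseflags(N) returns the 5-digit flag string for codepoint N
alias caseflags {
  var %c = $chr($1)
  ; each flag is 1 when the test is $true and 0 otherwise
  var %b1 = $iif($isupper(%c),1,0)
  var %b2 = $iif($islower(%c),1,0)
  var %b3 = $iif($isupper($upper(%c)),1,0)
  var %b4 = $iif($islower($lower(%c)),1,0)
  var %b5 = $iif($asc($upper(%c)) == $asc($lower(%c)),1,0)
  return $+(%b1,%b2,%b3,%b4,%b5)
}
```

For example: //echo -a 223: $caseflags(223) * 304: $caseflags(304)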

I've color-coded a portion of the unicode range to be displayed based on not complying with the above 'rules'. So, either there are some bugs in how a few unicode characters are handled, or there are additional rules that I'm not aware of, or there's a LOT of exceptions.

* pink (01011)
These are 282 codepoints where $islower(char) is $true and $isupper(char) is $false, yet $upper(char) returns the same codepoint instead of a different uppercase one.

* tan (10101)
These are 33 codepoints where $isupper(char) is $true and $islower(char) is $false, yet $lower(char) returns the same codepoint instead of a different lowercase one.

For the above 2 groups, it seems logical that a character could have only 1 of the 2 $isupper(char) or $islower(char) being $true without having $upper(char) and $lower(char) being different from each other. However, when looking at individual cases I have trouble finding a solution, and I don't know enough about the other languages to know if these results are legit.

For $chr(223), webpages say that this is lowercase, but I can't find a reference to an uppercase equivalent, so maybe it is possible for some characters to be used only in lowercase text without having an uppercase complement. Though, if $chr(223) is lowercase-only, I'm not sure what a solution would be besides the current behavior of $upper($chr(223)) displaying $chr(223) unchanged even though that means $isupper($upper($chr(223))) is $false.

For $chr(304), webpages say this is uppercase, but for linking a lowercase equivalent, they point at the 7bit $chr(105) 'i'. But whether it's a good idea to have a 7-bit codepoint be returned as the $lower of a codepoint above 128, I dunno.

* red (00000)
These are 32 codepoints where $isupper(char) and $islower(char) both report $false, yet they have a $lower(char) and $upper(char) which are different from each other.

* maroon (11110)
These are 79 codepoints where $isupper(char) and $islower(char) both report $true, yet they have a $lower(char) and $upper(char) which are different from each other.

These 2 red and maroon groups seem to be more of a problem, because when $upper(char) and $lower(char) are different from each other, it seems logical that it shouldn't be possible for $isupper(char) and $islower(char) to both be $true, or to both be $false.

* black (00001)
These were 45501 codepoints, too numerous to display. These all report $isupper(char) and $islower(char) as both being $false, yet each one is returned unchanged by both $upper() and $lower(). There were an additional 15725 characters which did follow the above 'rules', where they report $true for both $isupper(char) and $islower(char) since they didn't have an uppercase or lowercase form different from themselves.

Code
alias upperlower_test {
  var %i 1 | if (!$hget(test)) hmake -s test 1 | hdel -sw test z????? | hdel -sw test uplow.*
  while (%i isnum 1-65535) {
    var %char $chr(%i) , %up $upper(%char) , %lo $lower(%char)
    var %asc.char $asc(%char), %asc.up $asc(%up) , %asc.lo $asc(%lo)
    if ($isupper(%up) == $false) hinc -m test uplow.$isupper.says.output.of.$upper.is.$false
    if ($islower(%lo) == $false) hinc -m test uplow.$islower.says.output.of.$lower.is.$false
    if ($isupper(%char))                            var %b1 1 | else var %b1 0
    if ($islower(%char))                            var %b2 1 | else var %b2 0
    if ($isupper($upper(%char)))                    var %b3 1 | else var %b3 0
    if ($islower($lower(%char)))                    var %b4 1 | else var %b4 0
    if ($asc($upper(%char)) == $asc($lower(%char))) var %b5 1 | else var %b5 0
    var %a z $+ $+(%b1,%b2,%b3,%b4,%b5) | hinc test %a
    if (%a !isin z00001 z10110 z01110 z11111 z01011 z10101 z11110 z00000) {
      echo 12 -a debug: %a %i %char : isupper $isupper(%char) islower $islower(%char) * upperchar $isupper(%up) %asc.up * lowerchar $islower(%lo) %asc.lo
      ;bit order: isupper(char) islower(char) isupper(upper(char)) islower(lower(char)) upper(char)==lower(char)
      ;z10110 = normal uppercase
      ;z01110 = normal lowercase
      ;z11111 = normal non-alpha
      ;z00001 = isupper(char) islower(char) isupper(upper(char)) islower(lower(char)) all false, upper(char) == lower(char)
      ;z00000 = same four tests all false, yet upper(char) != lower(char)
      ;z11110 = isupper(char) and islower(char) both true, yet upper(char) != lower(char)
      ;z01011 = isupper(char) false, islower(char) true, yet upper(char) == lower(char)
      ;z10101 = isupper(char) true, islower(char) false, yet upper(char) == lower(char)
    }
    if (%a == z01011) echo 13 -ag %i %char $+(U+,$base(%i,10,16,4)) how can isupper(char) be $isupper(%char) and islower(char) be $islower(%char) though asc(upper(char) $asc($upper(%char)) === asc(lower(char)) $asc($lower(%char))
    if (%a == z10101) echo  7 -ag %i %char $+(U+,$base(%i,10,16,4)) how can isupper(char) be $isupper(%char) and islower(char) be $islower(%char) though asc(upper(char) $asc($upper(%char)) === asc(lower(char)) $asc($lower(%char))
    if (%a == z11110) echo  5 -ag %i %char $+(U+,$base(%i,10,16,4)) how can isupper(char) be $isupper(%char) and islower(char) both be $islower(%char) though upper(char) $upper(%char) $asc($upper(%char)) !== lower(char) $lower(%char) $asc($lower(%char))
    if (%a == z00000) echo  4 -ag %i %char $+(U+,$base(%i,10,16,4)) how can isupper(char) be $isupper(%char) and islower(char) both be $islower(%char) though upper(char) $upper(%char) $asc($upper(%char)) !== lower(char) $lower(%char) $asc($lower(%char))
    inc %i | if (%i = 55296) var %i 57344
  }
  var %a | noop $hfind(test,z11*,0,w,inc %a $hget(test,$1)) | echo -a isupper(char) .true and islower(char) .true: %a
  var %a | noop $hfind(test,z00*,0,w,inc %a $hget(test,$1)) | echo -a isupper(char) false and islower(char) false:  %a
  var %a | noop $hfind(test,z10*,0,w,inc %a $hget(test,$1)) | echo -a isupper(char) .true and islower(char) false: %a
  var %a | noop $hfind(test,z01*,0,w,inc %a $hget(test,$1)) | echo -a isupper(char) false and islower(char) .true: %a
  echo -ag *isupper() says false to char output by *$upper(): $hget(test,uplow.$isupper.says.output.of.$upper.is.$false)
  echo -ag *islower() says false to char output by *$lower(): $hget(test,uplow.$islower.says.output.of.$lower.is.$false)
  echo -a ====
  echo    -ag isupper|islower|isupper(upper(char))|islower(lower(char))|upper(char)==lower(char)
  echo    -ag 10110 normal uppercase: $hget(test,z10110)
  echo    -ag 01110 normal lowercase: $hget(test,z01110)
  echo    -ag 11111 normal non-alpha, isupper() and islower() both $true and upper(char)===lower(char): $hget(test,z11111)
  echo    -ag 00001 isupper() & islower() both say $false when upper(char) === lower(char): $hget(test,z00001)
  echo  7 -ag 10101 how can isupper(char) be true and islower(char) be false yet upper(char) is same as lower(char): $hget(test,z10101)
  echo 13 -ag 01011 how can islower(char) be true and isupper(char) be false yet upper(char) is same as lower(char): $hget(test,z01011)
  echo  4 -ag 00000 how can isupper(char) islower(char) isupper($upper(char)) islower($lower(char)) all be false when upper(char) !== lower(char): $hget(test,z00000)
  echo  5 -ag 11110 how can isupper(char) islower(char) isupper($upper(char)) islower($lower(char)) all be $true when upper(char) !== lower(char): $hget(test,z11110)
}