Re: Unicode $upper $lower $isupper $islower maroon #267822 06/10/20 08:02 PM
Joined: Dec 2002 Posts: 5,421 London, UK Khaled Hoopy frood
I have made a few changes in the next beta that coordinate the CRT and API calls behind the lower/upper case identifiers you used in your examples. These resolve some of the differences you point out; however, the Windows APIs still classify many characters in the way you describe above. You will need to look into this further to determine why this is the case. The best I can do is to use the APIs provided.

Regarding characters like the German Eszett, note that Unicode case mapping can be asymmetric. There is no guarantee that converting a letter from lower to upper to lower case will result in the same letter. To make matters more complicated, correct mapping of some Unicode characters/ranges depends on locale as well as transformation options, e.g. see LCMapStringEx(), so there is a lot more to it than just lower/upper case.

In addition, although mIRC uses UTF-16, and Windows itself uses UTF-16, which means API calls generally handle surrogate pairs/planes, that does not mean these are handled in all contexts. While mIRC was changed to use UTF-16, there are many places where surrogate pairs can be split while parsing text, which is where work still needs to be done.
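
For anyone following along, the two points in this post (asymmetric case mapping and surrogate pairs being split) can be illustrated with a small Python sketch. This is only an illustration under stated assumptions, not mIRC's or Windows' code: Python's str.upper() and str.lower() apply Unicode's default, locale-independent mappings, so results may differ from what the Windows APIs return. The helper names round_trips, utf16_units, and safe_split_index are hypothetical and exist only for this example.

```python
def round_trips(ch):
    """True if lower -> upper -> lower gives back the lowercased input."""
    lowered = ch.lower()
    return lowered.upper().lower() == lowered


def utf16_units(text):
    """Return the UTF-16 code units of text as a list of integers."""
    data = text.encode("utf-16-le")
    return [int.from_bytes(data[i:i + 2], "little") for i in range(0, len(data), 2)]


def is_high_surrogate(unit):
    return 0xD800 <= unit <= 0xDBFF


def is_low_surrogate(unit):
    return 0xDC00 <= unit <= 0xDFFF


def safe_split_index(units, index):
    """Move a split point back one unit if it would fall inside a surrogate pair."""
    if (0 < index < len(units)
            and is_high_surrogate(units[index - 1])
            and is_low_surrogate(units[index])):
        return index - 1
    return index


if __name__ == "__main__":
    # Eszett: uppercasing yields two letters, so the round trip does not return to it.
    print(round_trips("\u00df"))  # U+00DF LATIN SMALL LETTER SHARP S
    print(round_trips("a"))

    # U+1F600 lies outside the BMP, so it occupies two UTF-16 code units.
    units = utf16_units("a\U0001F600")
    print([hex(u) for u in units])

    # Splitting at index 2 would separate the pair; the helper backs up to 1.
    print(safe_split_index(units, 2))
```

A parser that splits text by counting UTF-16 code units would need a check along these lines at every split point, which gives a sense of why auditing all such places takes time.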

Entire Thread
Subject	Posted By	Posted
Unicode $upper $lower $isupper $islower	maroon	30/09/20 05:45 AM
Re: Unicode $upper $lower $isupper $islower	Khaled	30/09/20 06:45 AM
Re: Unicode $upper $lower $isupper $islower	Protopia	30/09/20 09:04 AM
Re: Unicode $upper $lower $isupper $islower	Khaled	06/10/20 08:02 PM
Re: Unicode $upper $lower $isupper $islower	Protopia	06/10/20 08:19 PM
