The Problem: Korean characters encoded in UTF-8 appear on the receiving client as garbled text mixed with capitalised raw commands. (This may not be limited to Korean.)
Both clients:
- are mIRC 6.17
- have Multibyte display, Multibyte editbox, SJIS/JIS conversion, and process ANSI codes enabled
- use a font containing Hangul characters (I tested with Arial Unicode MS, Code2000, and Bitstream Cyberbit. -- Hangul is the Korean alphabet.)
- have their font's script set to "Hangul" (This is done to be absolutely sure both clients are expecting Korean. -- When both are using UTF-8, the font's script should not matter, but oddly enough, it does matter in this experiment. That's another bug.)
- have UTF-8 set to "Display and Encode"
I tested this first with clones, then with copies of mIRC in separate directories, and lastly with several other users in various channels. All produced the same results.
I also tested this on networks that used CHARSET=ascii (like EFnet, DALnet) and CHARSET=utf-8 (like UniLang). The network's character set should not (and did not) make a difference in the results.
On one client, type this Korean (Hangul) character: 걹
The HTML code for it is &#44153;
In Windows Character Map, it is labeled as:
U+AC79 (0x819D): Hangul Syllable Kiyeok Eo Rieulkiyeok
*Note: To view it correctly in IE, go to menubar>View>Encoding and select "Unicode (UTF-8)". If that doesn't work, uncheck "Auto-Select", choose "Unicode (UTF-8)" again, or refresh the page. It should appear above as a single Korean character. If it appears correctly, you should be able to copy it from this page and try it yourself.
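For anyone reproducing this, it helps to know exactly which bytes the sending client should put on the wire when UTF-8 is set to "Display and Encode". The following is a minimal Python sketch, not anything taken from mIRC; the helper name `describe` is made up for illustration. It derives the HTML code and the UTF-8 byte sequence from the code point given above.

```python
import html


def describe(ch: str) -> dict:
    """Return the code point, HTML numeric entity, and UTF-8 bytes of one character."""
    cp = ord(ch)
    utf8 = ch.encode("utf-8")
    return {
        "code_point": f"U+{cp:04X}",
        "html_entity": f"&#{cp};",
        # Upper-case hex, one byte per group, e.g. "EA B1 B9"
        "utf8_hex": " ".join(f"{b:02X}" for b in utf8),
    }


# "\uAC79" is the test character, written as an escape so the script
# does not depend on the encoding of the source file.
info = describe("\uAC79")
print(info)

# The HTML numeric entity decodes back to the same character.
assert html.unescape(info["html_entity"]) == "\uAC79"
```

If the receiving client shows anything other than one character for the three bytes EA B1 B9, the fault is in decoding or display, not in what was typed.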
Results: On the sending client, it will appear correctly and immediately.
On the receiving client, it will appear as garbled text with a capitalised raw command, even though both clients are set to Hangul and set to "display and encode" UTF-8.
It will take a while to appear on the receiving client. If you wait, it will appear with a raw PONG command or a list of nicks. If you want to make it appear on the receiving client immediately, separately send any ASCII string in a second message. Then, it will appear with a raw PRIVMSG command followed by that ASCII string.
These results are not supposed to happen. The receiving client should immediately see the single Korean character--not garbled text after a long delay.
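To make the expected behaviour concrete, here is a rough Python sketch of how a receiver could handle the two messages from the test: split the incoming bytes into lines at CRLF first, and only then decode each complete line as UTF-8. This is an illustration of the expected result, not mIRC's actual code and not a diagnosis of the cause; the nick, host, channel, and function names are all made up.

```python
def split_irc_lines(buffer: bytes):
    """Split a raw receive buffer into complete IRC lines (CRLF removed).

    Returns (lines, remainder); remainder is any trailing partial line.
    """
    *lines, remainder = buffer.split(b"\r\n")
    return lines, remainder


def message_text(line: bytes) -> str:
    """Decode one complete line as UTF-8 and return its trailing parameter."""
    decoded = line.decode("utf-8", errors="replace")
    # The trailing parameter is everything after the first " :".
    _, _, trailing = decoded.partition(" :")
    return trailing


# Hypothetical wire data: the Korean test character, then a second
# message containing a plain ASCII string.
raw = (
    b":alice!a@example.invalid PRIVMSG #test :\xea\xb1\xb9\r\n"
    b":alice!a@example.invalid PRIVMSG #test :hello\r\n"
)

lines, rest = split_irc_lines(raw)
texts = [message_text(line) for line in lines]
print(texts)
```

Handled this way, the Korean character arrives as its own message, and the second message's raw PRIVMSG line never leaks into the displayed text.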
The same results can be achieved using other Korean (Hangul) characters, such as:
굥
HTML code: &#44389;
Windows Character Map:
U+AD65 (0x828B): Hangul Syllable Kiyeok Yo Ieung
And, again, this may not be limited to Korean.