Joined: Sep 2007
Posts: 1
MizardX Offline OP
Mostly harmless
I don't know if this is a bug or a missing feature. As decoding currently works, if a line contains any character with a code of 128 or above that is not part of a valid UTF-8 sequence, mIRC falls back on displaying the whole line as non-Unicode.

As a consequence, if the user puts a character with a value of 128 or above in the timestamp, he can't display Unicode at all. If he instead encodes those characters as UTF-8, they will display as junk whenever he receives a message that contains non-UTF-8 characters of 128 or above.

If decoding were instead done on a per-character basis, the user could much more easily mix Unicode characters with non-Unicode ones (128-255). If mIRC encountered a character of 128 or above that is not part of a UTF-8 sequence, it could render just that single character as non-Unicode instead of reverting the whole line to non-Unicode.

For example, without utf-8 timestamp:
«12:34:56» {Dennie} utf-8: åäö
«12:34:59» {Dennie} non-utf: åäö

With utf-8 timestamp:
«12:35:34» {Dennie} utf-8: åäö
«12:35:37» {Dennie} non-utf: åäö
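
To make the per-character idea concrete, here is a rough sketch in Python (not mIRC script, and not how mIRC actually decodes). It assumes the non-Unicode fallback is Latin-1, whereas a real client would presumably use the system ANSI code page. The names `decode_mixed` and `latin1-fallback` are made up for this example.

```python
import codecs


def _latin1_fallback(err):
    """Error handler: decode only the offending bytes as Latin-1 (assumed fallback)."""
    if not isinstance(err, UnicodeDecodeError):
        raise err
    bad = err.object[err.start:err.end]
    # Return the replacement text and the position where decoding should resume.
    return bad.decode("latin-1"), err.end


# Register the handler under a hypothetical name so bytes.decode() can use it.
codecs.register_error("latin1-fallback", _latin1_fallback)


def decode_mixed(raw: bytes) -> str:
    """Decode valid UTF-8 sequences as UTF-8 and every other byte as Latin-1."""
    return raw.decode("utf-8", errors="latin1-fallback")


# A non-UTF-8 timestamp followed by UTF-8 message text on the same line.
line = "«12:34:56» ".encode("latin-1") + "utf-8: åäö".encode("utf-8")
print(decode_mixed(line))
```

With this approach a line made up entirely of non-UTF-8 bytes still decodes the same way it would under the whole-line fallback, so nothing that works today should get worse.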

One problem with this new approach is that if someone accidentally sends a valid UTF-8 sequence without UTF-8 encoding enabled, it will be displayed as a Unicode character. But this problem also exists in the current version.
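
A small Python illustration of that ambiguity, again assuming Latin-1 as the sender's non-Unicode encoding: the two characters "Ã¥" typed by a non-UTF-8 user go over the wire as exactly the same bytes as a UTF-8 "å", so no decoder can tell them apart.

```python
# What a non-UTF-8 user typed, encoded with the assumed Latin-1 code page.
typed = "Ã¥"
wire = typed.encode("latin-1")

# The receiver sees a valid UTF-8 sequence and shows a single character.
shown = wire.decode("utf-8")

print(wire)                        # the raw bytes on the wire
print(shown)                       # what a UTF-8-aware client displays
print(wire == "å".encode("utf-8"))
```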

Last edited by MizardX; 23/03/08 03:21 PM.
Joined: Oct 2004
Posts: 8,330
Hoopy frood
This is an issue with how Unicode works. You can use $utfencode to get around the problem, though it's not the easiest thing to do if the text can be anything.


Invision Support
#Invision on irc.irchighway.net
Joined: Oct 2003
Posts: 3,918
Hoopy frood
Well, ideally this is an issue with how Unicode works; however, mIRC *could* decide not to revert the entire line to ANSI if it fails the "is Unicode" test. This is how your web browser works as well, i.e. if you have invalid Unicode output in a UTF-8 page, it won't revert the entire page to ANSI, only the offending bytes.


- argv[0] on EFnet #mIRC
- "Life is a pointer to an integer without a cast"
Joined: Oct 2004
Posts: 8,330
Hoopy frood
True. And it would be great if it handled it that way.


Invision Support
#Invision on irc.irchighway.net
Joined: Apr 2004
Posts: 871
Sat Offline
Hoopy frood
an acceptable compromise might be to do it on a per-word basis
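
A per-word variant might look something like this in Python. It is only a sketch under the assumptions that words are separated by single spaces and that the fallback is Latin-1; `decode_per_word` is a made-up name.

```python
def decode_per_word(raw: bytes, fallback: str = "latin-1") -> str:
    """Decode each space-separated word as UTF-8, or as the fallback if that fails."""
    words = []
    for word in raw.split(b" "):
        try:
            words.append(word.decode("utf-8"))
        except UnicodeDecodeError:
            # Only this word reverts to the non-Unicode code page.
            words.append(word.decode(fallback))
    return " ".join(words)


# Non-UTF-8 timestamp and UTF-8 text are in separate words, so both survive.
line = "«12:34:56»".encode("latin-1") + b" " + "åäö".encode("utf-8")
print(decode_per_word(line))
```

The trade-off is that a single word mixing both encodings still falls back as a whole, so any UTF-8 inside it comes out as junk.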


Saturn, QuakeNet staff
