mIRC Home    About    Download    Register    News    Help

Print Thread
#221090 07/05/10 12:33 PM
Joined: Oct 2003
Posts: 214
S
Fjord artisan
OP Offline
Fjord artisan
S
Joined: Oct 2003
Posts: 214
Chr 127-159 won't be displayed anymore.


one step closer to world domination
Joined: Jan 2003
Posts: 2,523
Q
Hoopy frood
Offline
Hoopy frood
Q
Joined: Jan 2003
Posts: 2,523
Not a bug, see this post for an explanation.


/.timerQ 1 0 echo /.timerQ 1 0 $timer(Q).com
Joined: Oct 2003
Posts: 214
S
Fjord artisan
OP Offline
Fjord artisan
S
Joined: Oct 2003
Posts: 214
Well uhm what about comparing text from old clients with new scripts? Do i have to add for each thing 2 if requests?

Example:
on *:text:*:#:{ if (mäh == $1-) || (mäh == $utfencode($1-)) { } }


one step closer to world domination
Joined: Oct 2004
Posts: 8,330
Hoopy frood
Offline
Hoopy frood
Joined: Oct 2004
Posts: 8,330
First, ä isn't in that range, so let's look at something that is: —

You won't want to use $utfencode() at all. You can do a $replace of the old character with the unicode version of it (the character isn't gone, it just has a new code) before the check if you need to.

$replace($1-,$chr(151),$chr(8212))

That would be placed on the 7.x client. The 6.35 client should be able to handle just a check of for — or whatever other character.

Or you could just do if ($1- == text $+ $chr(151) $+ text) instead of if($1- == text—text). The downside here is that you would need 2 checks (one for each way). The first example only requires the one check.

I'm sure there are other ways as well.


Invision Support
#Invision on irc.irchighway.net
Joined: Oct 2003
Posts: 214
S
Fjord artisan
OP Offline
Fjord artisan
S
Joined: Oct 2003
Posts: 214
Chr 128-159 are not being converted as they where before.

Output from a loop through chr 127-255
Code:
utflist {
  var %x = 126
  while (%x < 255) {
    inc %x
    echo * chr( $+ %x $+ ) => $asc($left($utfencode($chr(%x)),1)) $asc($right($utfencode($chr(%x)),1))
  }
}

returned on mIRC 6.35:
http://nopaste.php-q.net/284557

and this on 7.02
http://nopaste.php-q.net/284558



Both mIRC where running on Win7


one step closer to world domination
Joined: Oct 2003
Posts: 3,918
A
Hoopy frood
Offline
Hoopy frood
A
Joined: Oct 2003
Posts: 3,918
As mentioned above, you should not be calling $utfencode on data from 7.x as it is already Unicode.

$asc($chr(%x)) should be sufficient in 7.x

Again, this is not a bug, it is an intentional benefit of mIRC's Unicode support

To deal with old scripts, use this alias:

Code:
alias myutfencode { return $iif($version < 7, $utfencode($1-), $1-) }


Though I wonder if maybe mIRC could just no-op the $utfencode/$utfdecode aliases from 7.x on to avoid confusion.


- argv[0] on EFnet #mIRC
- "Life is a pointer to an integer without a cast"
Joined: Oct 2003
Posts: 214
S
Fjord artisan
OP Offline
Fjord artisan
S
Joined: Oct 2003
Posts: 214
Originally Posted By: argv0
As mentioned above, you should not be calling $utfencode on data from 7.x as it is already Unicode.

It is not unless it's displayed.

So then it's the editbox or what? If i type ALT+0159 i can see Ÿ (Latin Captial Letter Y with diaeresis) and when i send it or echo it, it disappears. Or better it becomes a control code.


one step closer to world domination
Joined: Oct 2003
Posts: 3,918
A
Hoopy frood
Offline
Hoopy frood
A
Joined: Oct 2003
Posts: 3,918
Quote:
"The characters 128-159 are not used in ISO 8859-1 and Unicode"


From http://www.pemberley.com/janeinfo/latin1.html

Therefore $chr(159) is meaningless to mIRC, it should not display anything. The issue with the editbox might be an issue with font-linking and Windows handling things a little differently than mIRC's end, which is strictly UTF-8. As Khaled has said, what happens in the editbox is entirely defined by MS's Richedit control behaviour.

Latin Capital Letter Y with Diaeresis is U+0178: http://www.fileformat.info/info/unicode/char/0178/index.htm

Use //echo -a $chr(376) instead (376 is 0178 in base-10).


- argv[0] on EFnet #mIRC
- "Life is a pointer to an integer without a cast"
Joined: Oct 2003
Posts: 214
S
Fjord artisan
OP Offline
Fjord artisan
S
Joined: Oct 2003
Posts: 214
As mIRC is a Windows Application it should take care of the Microsoft Codepage 1252.

And mIRC did this before thats why the results of $utfencode() differs in 6.35 to 7.02.

Quote:
The official CP1252<->Unicode conversion table is printed in the Unicode 2.0 standard for instance, and is available on <ftp://ftp.informatik.uni-erlangen.de/pub/doc/ISO/charsets/> in the file ucs-map-cp1252. [Now see the file ftp://ftp.unicode.org/Public/MAPPINGS/VENDORS/MICSFT/WINDOWS/CP1252.TXT at the official Unicode site.]


one step closer to world domination
Joined: Oct 2003
Posts: 3,918
A
Hoopy frood
Offline
Hoopy frood
A
Joined: Oct 2003
Posts: 3,918
mIRC does not handle CP1252 (or any codepages) anymore. This is an intentional design choice for Unicode compatibility. You should no longer be using any codepages in mIRC.

There is also no rule that says to be a Windows application it must handle CP1252. When your web browsers are set to UTF-8 content-types, they do not handle CP1252. Certainly web-browsers are Windows applications.

It should be noted that any CP1252 support in Windows applications is likely legacy only.

And no, the reason $utfencode differs is because you're attempting to $utfencode already encoded data; all data in mIRC has already been encoded in UTF-8, so calling $utfencode will double-encode the data. It would be equivalent to the following code in 6.35:

//echo -a $utfencode($utfencode(A))

You should never use $utf* functions in 7.x, full stop.


- argv[0] on EFnet #mIRC
- "Life is a pointer to an integer without a cast"
Joined: Oct 2003
Posts: 214
S
Fjord artisan
OP Offline
Fjord artisan
S
Joined: Oct 2003
Posts: 214
Originally Posted By: argv0
And no, the reason $utfencode differs is because you're attempting to $utfencode already encoded data; all data in mIRC has already been encoded in UTF-8, so calling $utfencode will double-encode the data. It would be equivalent to the following code in 6.35:

//echo -a $utfencode($utfencode(A))

You should never use $utf* functions in 7.x, full stop.


You are absolutly wrong. If i would encode it twice the result would be the same all the time, so i would just "encode" chr 194 all the time and this should end up in char 195 and 130 for all of them. All characters (128-159) start with the char 194 and then increase from 128 to 159 in unicode as you can see on the screenshot in my post above. I'm not saying that this is wrong, i'm just saying that mIRC should be compatible as it was in all versions before this mess started.


one step closer to world domination
Joined: Jan 2003
Posts: 2,523
Q
Hoopy frood
Offline
Hoopy frood
Q
Joined: Jan 2003
Posts: 2,523
Originally Posted By: Sephiroth_
So then it's the editbox or what? If i type ALT+0159 i can see Ÿ (Latin Captial Letter Y with diaeresis) and when i send it or echo it, it disappears. Or better it becomes a control code.

When you type Alt+0159, Windows (and not mIRC) produces the Unicode equivalent of the char from the codepage that corresponds to your current input (keyboard) language. The reason you may sometimes see a block is probably because you are not using a Unicode font (such as Arial Unicode MS or Fixedsys Excelsior): in that case, fontlinking kicks in, grabbing a symbol (glyph) from the first font it finds in your system that claims to support that character. In practice this does not always work, so sometimes you get a block. Note that this will never happen if you switch to a Unicode font.

The aforementioned behaviour of Alt+NNNN can be handy in this case. For example, if your old scripts use a hardcoded char 159, you can simply delete it and hit Alt+0159 in the Editor to get the Unicode version of this char.


/.timerQ 1 0 echo /.timerQ 1 0 $timer(Q).com
Joined: Oct 2003
Posts: 3,918
A
Hoopy frood
Offline
Hoopy frood
A
Joined: Oct 2003
Posts: 3,918
Quote:
i'm just saying that mIRC should be compatible as it was in all versions before this mess started.


The mIRC 7 beta is specifically *not* meant to be compatible with non-unicode data. This is why Khaled put out a beta, so people could test and get accustomed to the new behaviour before the new version is released. mIRC7 will not be compatible with 6.35 with regards to handling of Unicode, and, for the last time, the behaviour of $utfencode is undefined in mIRC7, so it should not be used. For all intents and purposes, $utf* identifiers do not "exist" in 7.

Please read the explicit disclaimer about codepage support on the beta page: http://www.mirc.com/beta.html


- argv[0] on EFnet #mIRC
- "Life is a pointer to an integer without a cast"
Joined: Oct 2003
Posts: 214
S
Fjord artisan
OP Offline
Fjord artisan
S
Joined: Oct 2003
Posts: 214
After i did /font and pressing [ok] this 'problem' was solved, i used the mirc.ini from 6.35 to test if my scripts are almost fully compatible with 7.x and a simple upgrade method.

In mIRC 6.35 the [fonts] entry was "fstatus=Lucida Console,412,0,1" after that it changed to "fstatus=Lucida Console,412,0,2,0" and everything worked fine.
The chr 159 was not 'transforming' to a control code anymore. (same for all other characters from 128-159)


one step closer to world domination
Joined: Dec 2002
Posts: 5,411
Hoopy frood
Offline
Hoopy frood
Joined: Dec 2002
Posts: 5,411
In mIRC v6.35, if you have set your window font to use eg. a Greek script/codepage and you then use $utfencode(text) in that window, it will perform the conversion using the Greek codepage. Resetting the fonts also reset your script/codepage for that font, so that is probably why the issue resolved itself.

Also in mIRC v6.35, the characters 128-256 will appear differently to users on IRC depending on their system language or their selected font script in the a window. That means it is not reliable to use this range of characters on IRC - there is no guarantee that users will see the same characters.

That is why mIRC v7.x now uses UTF-8 for all text, to ensure that everyone, everywhere, consistently sees the same characters. In mIRC v7.x, all text is Unicode text. When you use $utfencode() or $utfdecode(), it will convert to/from Unicode/UTF-8. The Unicode/UTF-8 conversion routines check (on a per character basis) if text is already in UTF-8 format and will prevent it from being double-encoded.

Joined: Oct 2004
Posts: 8,330
Hoopy frood
Offline
Hoopy frood
Joined: Oct 2004
Posts: 8,330
If I remember mirc.ini format well enough, setting "2" in your font like you did will enabled the "encode and send" option in 6.35. That is probably related to why it works when you type something. Keep in mind that the change you mention is a change in whether or not it displays in 6.35. You originally were asking about 7.x and not 6.35. If you had said characters disappearing in 6.35 when you typed them (instead of 7.x), we could have given you that answer right away.


Invision Support
#Invision on irc.irchighway.net

Link Copied to Clipboard