#165154 22/11/06 05:33 PM
Joined: Apr 2004
Posts: 759
M
Hoopy frood
OP Offline
Since I don't class myself as a Unicode expert, I'm posting this here before moaning that it's a bug.

Consider these cases for the Tahoma font:

//echo -a $regex(£a,/\xA3/g)
The pound sign is U+00A3.
This matches perfectly.

However, the € sign (U+20AC) needs more than two hexadecimal digits, which doesn't seem to work in mIRC.

//echo -a $regex(£,/\x{A3}/g) $regex(ÿ,/\x{FF}/g) $regex(Ā,/\x{100}/g) $regex(€,/\x{20AC}/g)
Returns 1 1 0 0.
This might have something to do with more than two hexadecimal digits requiring braces.
//echo -a $regex(€,/\X/g)
Returns 0 even though \X tests for

While I am at it, do the \p and \P character classes even work? Or are they not compiled into mIRC?
//echo -a $regex(a,\P{Greek}) $regex(a,\p{Greek})
One of these is bound to return 1, but it returns 0 0 anyway.
$regex(£,/\p{Sc}/g)
This should match as well, since it's a currency symbol.
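
For what it's worth, the claim that £ is a currency symbol can be double-checked outside mIRC. Below is a small Python sketch using the standard `unicodedata` module. It is not mIRC script and says nothing about how mIRC's PCRE was compiled; the helper name `general_category` is made up for illustration.

```python
import unicodedata

# Hypothetical helper: returns the two-letter Unicode general category
# of a single character (e.g. "Sc" for currency symbols).
def general_category(ch: str) -> str:
    return unicodedata.category(ch)

# Characters mentioned in this post.
for ch in "£€a":
    print(ch, f"U+{ord(ch):04X}", general_category(ch))
```

Both £ and € are reported as Sc, so a regex engine with Unicode property support would be expected to match them with \p{Sc}.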


$maybe
#165155 22/11/06 05:51 PM
Joined: Apr 2004
Posts: 759
M
Hoopy frood
OP Offline
After reading the PCRE documentation more thoroughly, I figured it out myself.
Quote:

After \x, from zero to two hexadecimal digits are read (letters can be
in upper or lower case). Any number of hexadecimal digits may appear
between \x{ and }, but the value of the character code must be less
than 256 in non-UTF-8 mode, and less than 2**31 in UTF-8 mode


Guess that answers all my questions: regex is not in UTF-8 mode :tongue:
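
As a rough check of the quoted limit, here is a small Python sketch (not mIRC script, and not PCRE itself) that applies the "less than 256 in non-UTF-8 mode" rule to the four characters tested in the first post. The helper name `fits_in_non_utf8_mode` is made up for illustration, and the sketch only compares code point values; it does not model how mIRC actually hands text to the regex engine.

```python
# Hypothetical helper (not part of mIRC or PCRE): applies the quoted
# "less than 256 in non-UTF-8 mode" limit to a single character.
def fits_in_non_utf8_mode(ch: str) -> bool:
    return ord(ch) < 256

# The four characters tested in the first post.
for ch in "£ÿĀ€":
    print(f"{ch} U+{ord(ch):04X} {int(fits_in_non_utf8_mode(ch))}")
```

The flags come out as 1 1 0 0, the same pattern as the $regex results in the first post: U+00A3 and U+00FF fit under the limit, while U+0100 and U+20AC do not.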


$maybe
