Posted By: Thrull Help with Unicode in Regex - 29/10/11 04:39 AM
Is there a way, using regex, to match a range of Unicode characters, such as everything from È to Ĭ (alt-200 to alt-300)?
Posted By: Wims Re: Help with Unicode in Regex - 29/10/11 01:30 PM
I feel like this is not the solution you are looking for, but is there a problem with $regex(aÈcatchĬb,/È(.*?)Ĭ/) followed by $regml(1)?
Posted By: Sat Re: Help with Unicode in Regex - 29/10/11 02:49 PM
Yes, you can use the (*UTF8) modifier in the regex. The following example will return 1 because the given character is in the given range:

Code:
$regex(É,/(*UTF8)[È-Ê]/g)

Without (*UTF8), it does not work as intended: the regex is matched byte by byte rather than character by character, so it returns 2 to indicate it (more or less accidentally) found two matching extended-ASCII bytes.
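
To illustrate why the count changes from 1 to 2, here is a rough sketch in Python rather than mIRC script. It assumes the text is stored as UTF-8, and it uses Python's re module as a stand-in for the regex engine; the variable names are made up for the example. The same pattern is run once on characters and once on raw bytes.

```python
import re

char = "É"          # U+00C9, encoded in UTF-8 as the two bytes C3 89
pattern = "[È-Ê]"   # U+00C8 through U+00CA

# Character-aware matching: the class is a range of code points.
char_matches = re.findall(pattern, char)

# Byte-wise matching: the same pattern and subject as raw UTF-8 bytes.
# The class now reads as: byte C3, byte range 88-C3, byte 8A.
byte_matches = re.findall(pattern.encode("utf-8"), char.encode("utf-8"))

print(len(char_matches))  # 1
print(len(byte_matches))  # 2

# The range from the original question, alt-200 to alt-300
# (U+00C8 to U+012C), written with explicit code points.
wide_range = re.compile("[\u00c8-\u012c]")
print(wide_range.findall("aÈbĬcZ"))  # ['È', 'Ĭ']
```

In byte mode both bytes of É happen to fall inside the misread class, which is where the 2 comes from. For the original question, the mIRC equivalent of the last pattern would presumably be /(*UTF8)[\x{C8}-\x{12C}]/, using the regex engine's \x{...} code point escapes; that form is an assumption and has not been tested in mIRC here.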
Posted By: Thrull Re: Help with Unicode in Regex - 30/10/11 05:24 AM
Wims: By "to" I meant everything from alt-200 to alt-300 (alt-201, alt-202, alt-203, etc.). But yes, I can see how it could be misunderstood.

Sat: Thanks, that's what I was looking for. It is still causing issues, but I think I can work around them.

© mIRC Discussion Forums