PCRE's Special Start-of-Pattern Modifiers
- (*UTF) is a generic way to treat the subject as a UTF string, detecting whether it should be treated as UTF-8, UTF-16 or UTF-32.
- (*UTF8), (*UTF16) and (*UTF32) treat the string as one of three specific UTF encodings.
- (*UCP), which stands for Unicode Character Properties, allows Unicode characters to match the digit and word character classes.
What UCP means is that characters from international scripts will be interpreted as letters and numbers, and so will be matched by the \w and \d classes.
Source: http://www.rexegg.com/regex-modifiers.html#pcre
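To make the difference concrete, here is a rough sketch in Python. This is only an analogy: Python's re module is not PCRE and has no (*UTF8) or (*UCP) verbs. I am assuming that Python's default Unicode-aware \w and \d behave roughly like PCRE with (*UCP) on, and that re.ASCII behaves roughly like decoding UTF-8 without (*UCP). The sample strings and variable names are made up for illustration.

```python
import re

# Hypothetical sample data: non-ASCII digits and an accented word.
digits = "\u0663\u0664\u0665"  # Arabic-Indic digits 3, 4, 5
word = "caf\u00e9"             # "cafe" with an accented final e

# Unicode-aware classes (Python's default for str patterns).
# Assumed to be roughly analogous to PCRE with (*UCP) enabled.
unicode_digit = re.compile(r"^\d+$")
unicode_word = re.compile(r"^\w+$")

# ASCII-only classes: \d is [0-9] and \w is [A-Za-z0-9_].
# Assumed to be roughly analogous to (*UTF8) without (*UCP).
ascii_digit = re.compile(r"^\d+$", re.ASCII)
ascii_word = re.compile(r"^\w+$", re.ASCII)

for label, pattern, text in [
    ("unicode \\d", unicode_digit, digits),
    ("ascii \\d", ascii_digit, digits),
    ("unicode \\w", unicode_word, word),
    ("ascii \\w", ascii_word, word),
]:
    print(label, "matches" if pattern.match(text) else "does not match")
```

In both modes the text is still read as whole characters rather than raw bytes. The only thing that changes is which characters the shorthand classes accept.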
The PCRE /u flag that mIRC supports is a combination of both (*UTF8) and (*UCP).
mIRC also supports the /W flag, which is just (*UCP) and kind of useless on its own.
But mIRC does not support /8, which is (*UTF8) support without the (*UCP) behavior.
Could mIRC please include support for /8 as a pattern modifier flag?
I think most users would intend to use /8 over /u in their pattern modifiers if it were made available and documented better.