
Regex anchors broken with //m (multi-line)

Posted By: Talon

Regex anchors broken with //m (multi-line) - 12/11/21 04:27 AM

I'm on mIRC v7.67, Windows 10 64bit.

When using the anchors "^" and "$" in multi-line mode, the anchors should change from start/end of string to start/end of line. In the example below, "^" behaves correctly, but "$" is still treated as end of string. Both expressions should match, yet the first fails and the second passes.

//var %str = $+(abc,$crlf,def,$crlf,ghi) , %pat1 = /^def$/m , %pat2 = /^ghi$/m | echo -a $regex(%str,%pat1) vs $regex(%str,%pat2)
Posted By: Wims

Re: Regex anchors broken with //m (multi-line) - 12/11/21 12:18 PM

This is not a bug. mIRC uses PCRE with its default line sequence of $lf. This was discussed not too long ago, with Khaled asking if he should make it $crlf.

Nobody answered that question positively, which is good: we don't actually want CRLF. LF is the default and is more compatible; it's just a thing to be aware of.
Posted By: Loki12583

Re: Regex anchors broken with //m (multi-line) - 12/11/21 12:23 PM

Works correctly with $lf instead of $crlf
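
For comparison, here is the original test with the lines joined by $lf only. This is an untested sketch; if PCRE's default LF newline is in effect as described above, both results should be 1:

//var %str = $+(abc,$lf,def,$lf,ghi) , %pat1 = /^def$/m , %pat2 = /^ghi$/m | echo -a $regex(%str,%pat1) vs $regex(%str,%pat2)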
Posted By: Talon

Re: Regex anchors broken with //m (multi-line) - 12/11/21 02:46 PM

The problem came up when reading files: Windows uses \r\n for newlines in files, which is why the example used $crlf to mimic a file read.

Diving further into PCRE, there are various pattern modifiers to force mode flags, like (*UTF), (*UTF8), etc. There's also one that makes PCRE recognize any of CR, LF, or CRLF as a newline (Windows \r\n, for instance). For anyone else experiencing this issue, the way to do it is (*ANYCRLF).

Example:

//var %str = $+(abc,$crlf,def,$crlf,ghi) , %pat1 = /(*ANYCRLF)^def$/m , %pat2 = /^def$/m | echo -a Success! $regex(%str,%pat1) Fail: $regex(%str,%pat2)
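
If (*ANYCRLF) is not wanted, two other approaches should give the same result. This is an untested sketch, and the variable names %lfonly and %pat are only illustrative. The first strips every $cr from the text before matching, so only $lf line breaks remain; the second leaves the text alone and allows an optional \r before the end-of-line anchor:

//var %str = $+(abc,$crlf,def,$crlf,ghi) , %lfonly = $remove(%str,$cr) , %pat = /^def\r?$/m | echo -a Stripped: $regex(%lfonly,/^def$/m) Optional CR: $regex(%str,%pat)

Stripping $cr changes the text itself, so character positions no longer line up with the original file; the \r? form keeps the text intact but has to be repeated in front of every "$" in the pattern.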