Hoopy frood (OP)
Joined: Jul 2006
Posts: 4,222
As I mentioned in this thread, we don't have access to the full match, or to its position, as found by the PCRE engine. Most of the time that isn't needed, but in some cases it is useful. I'd use it for a bot that shows the capturing groups and what is matched: something like !regex /b+a.*c/ bbbbadddc would be shown as bbbbadddc, for example. My proposed syntax in that link is not so great though, since it doesn't allow us to get the position of the full match. So I suggest $regmlex([name],M,[N],full) to return the full match, on which the .pos property could be used.
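A minimal sketch of the idea, using a workaround that is available today: wrap the whole pattern in an extra capturing group so that group 1 holds the full match. The alias name fullmatch_demo and the regex name demo are made up for this example. The trick shifts every other group number by one and can break numbered backreferences, which is why a built-in way would be preferable.

```mirc
; sketch only: "fullmatch_demo" and the regex name "demo" are made up for this example
alias fullmatch_demo {
  ; the added outer ( ) makes group 1 the full match
  noop $regex(demo, bbbbadddc, /(b+a.*c)/)
  echo -a full match: $regml(demo, 1) - position: $regml(demo, 1).pos
}
```

Typing /fullmatch_demo should echo the whole matched text together with the position reported by .pos.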
#mircscripting @ irc.swiftirc.net == the best mIRC help channel

Nutrimatic drinks dispenser
Joined: Jun 2008
Posts: 6
I also support this idea.

Fjord artisan
Joined: Feb 2006
Posts: 546
i also add my support because it can be useful, and scripting this to work in the general case is difficult but possible! i finally found a way to do this that doesn't involve completely parsing the expression:
; $regexm(<string>, <regex> [, N])[.pos]
; Get the Nth value of match[0] (defaults to N = 1)
; Return its position with .pos
alias regexm {
  ; parse the full regex as mIRC does
  noop $regex(full, $2, /^(?|m(.)?|(/)|^)(.*?)(?:\1(?!.*\1)(.*)$)?$/usD)
  ; isolate PCRE options from the start of the expression
  noop $regex(pcre, $regmlex(full, 1, 2), /^((?:(?!\((?:(?:MARK|PRUNE|SKIP|THEN|\*(?=:))(?::[^()]*)?|ACCEPT|COMMIT)\))\(\*.*?\))*)(.*)/us)
  ; validate expression
  if (!$regex( , $+(/, $regmlex(pcre, 1, 1), |, $regmlex(pcre, 1, 2), /))) {
    echo -eagc i * $!regexm: Invalid expression ( $+ $regerrstr $+ )
    return
  }
  var %char, %exp
  ; find suitable placeholder char
  while ($chr($r(2048, 55295)) isin $1) /
  %char = $v1
  ; add \K onto end of given expression
  ; first wrap entire expression in (?: ), adding \E in case of unterminated \Q
  %exp = $+(m, $regmlex(full, 1, 1), $regmlex(pcre, 1, 1), (?: $+ $regmlex(pcre, 1, 2) $+ \E)(?(R)|\K), $regmlex(full, 1, 1), $regmlex(full, 1, 3))
  ; construct a second expression by transforming the result of subbing the placeholder char where the matches occurred
  ; run this against the result of subbing the placeholder into the matches of the \K-modified expression above
  ; and hey presto, you can find the matches with no ambiguity
  noop $regex(final, $regsubex($1, %exp, %char), $+(/\Q, $replace($regsubex($1, $2, %char), \E, \E\\E\Q, %char, \E(.*?)\Q $+ %char), \E/u))
  ; if it's a position we seek, we need to subtract all placeholders factored into the result
  if ($prop == pos) return $calc(1 + $regml(final, $3 1).pos - $3 1)
  returnex $regml(final, $3 1)
}
example usage:
//var -s %str = babababababa, %re = /ba(?=ba$)|baba(?=bababa$)/g | echo -eag $regsubex(%str, %re, X) - $regexm(%str, %re, 1).pos - $regexm(%str, %re, 2).pos
not thoroughly tested, but i just wanted to get you your Christmas present quickly. the basic principle is to modify the expression slightly by adding \K on the end, then compare the result of substituting normally vs. with \K. that will tell you all you need to be able to figure out exactly which substring(s) matched!
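To make that principle concrete, here is a stripped-down sketch on a fixed string. It is not the alias above: the alias name kdemo and the regex name demo are made up, and a literal X stands in for the random placeholder char. The first echo should print xxXxx and the second xxabcXxx, so whatever sits in front of the placeholder in the second result, and is missing from the first, is the matched text.

```mirc
; sketch only: "kdemo" and the regex name "demo" are made up, and a literal X
; stands in for the random placeholder char the real alias picks
alias kdemo {
  var %s = xxabcxx
  ; normal substitution replaces the match itself
  echo -a normal: $regsubex(%s, /abc/, X)
  ; with \K the reported match is empty and sits just after abc, so X is inserted there
  echo -a with \K: $regsubex(%s, /(?:abc)\K/, X)
  ; pattern derived by hand from the normal result, run against the \K result
  noop $regex(demo, $regsubex(%s, /(?:abc)\K/, X), /\Qxx\E(.*?)\QXxx\E/)
  echo -a recovered: $regml(demo, 1)
}
```

The last pattern is written by hand here; the real alias builds it from the normal substitution result, which is how it copes with several matches under /g.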
"The only excuse for making a useless script is that one admires it intensely" - Oscar Wilde

Fjord artisan
Joined: Feb 2006
Posts: 546
sorry, glaring oversight: that $replace() on the 4th last line should of course be $replacecs()

Hoopy frood (OP)
Joined: Jul 2006
Posts: 4,222
@jaytea: I finally found the time to put this alias to use, but I quickly found an issue:
//echo -a $regexm(eaze ezrrazer zr5ze45ra5z5 t,/((\w+)\S+)*/Fg,1)
returns nothing. Debugging shows that the final pattern built in that code, $+(/\Q, $replacecs($regsubex($1, $2, %char), \E, \E\\E\Q, %char, \E(.*?)\Q $+ %char), results in something requiring 8 $chr($r(2048, 55295)) chars, but the input string being tested contains only 5 of them...
@Khaled: is it possible to hear from you about this? Is this on your to-do list somewhere?

Hoopy frood (OP)
Joined: Jul 2006
Posts: 4,222
Also, regarding the suggested syntax: $regmlex([name],M,-1) would be better, returning the full match of the Mth match.

Fjord artisan
Joined: Feb 2006
Posts: 546
so i forgot to post the solution to this.. turns out the problem was related to the "empty string with //g" issue that plagues PCREv1's demo code. here's the fix:
; $regexm(<string>, <regex> [, N])[.pos]
; Get the Nth value of match[0] (defaults to N = 1)
; Return its position with .pos
alias regexm {
  noop $regex(full, $2, /^(?|m(.)?|(/)|^)(.*?)(?:\1(?!.*\1)(.*)$)?$/usD)
  noop $regex(pcre, $regmlex(full, 1, 2), /^((?:(?!\((?:(?:MARK|PRUNE|SKIP|THEN|\*(?=:))(?::[^()]*)?|ACCEPT|COMMIT)\))\(\*.*?\))*)(.*)/us)
  if (!$regex( , $+(/, $regmlex(pcre, 1, 1), |, $regmlex(pcre, 1, 2), /))) {
    echo -eagc i * $!regexm: Invalid expression ( $+ $regerrstr $+ )
    return
  }
  var %char, %exp
  while ($chr($r(2048, 55295)) isin $1) /
  %char = $v1
  var %exp = $+(m, $regmlex(full, 1, 1), $regmlex(pcre, 1, 1), (?: $+ $regmlex(pcre, 1, 2) $+ \E)(?(R)|\K), $regmlex(full, 1, 1), $regmlex(full, 1, 3))
  var %str = $regsubex($1, %exp, %char) .
  if (!$pos(%str, %char, $regex($1, $2))) {
    noop $regex(check, $regsubex($1, $2, %char), / %char ( %char ?)/gxu)
    %str = $regsubex(fix, $left(%str, -2), / %char \K/gxu, $regml(check, \n)) .
  }
  noop $regex(final, $left(%str, -2), $+(/\Q, $replacecs($regsubex($1, $2, %char), \E, \E\\E\Q, %char, \E(.*?)\Q $+ %char), \E/u))
  if ($prop == pos) && ($3. isnum 1- $regml(final, 0)) return $calc(1 + $regml(final, $3).pos - $3)
  returnex $regml(final, $3 1)
}

Ameglian cow
Joined: Oct 2017
Posts: 47
I was looking for the same thing, and wondering why mIRC doesn't already include a $regexm() identifier.
It would be pretty useful if it could be included in the next version.