|
Joined: Feb 2006
Posts: 546
Fjord artisan
|
OP
there seems to be a new limitation on recursive matching (?N) in mirc 6.2:
//echo -ag $regex(216.223.2.249,/^(?:(1?\d?\d|2[0-4]\d|25[0-5])\.){3}(?1)$/)
match fails due to that (?1) there, but it works in mirc 6.17.
to help diagnose the problem:
[17:42:46] (@Msmo) I can only guess why it's happening
[17:43:05] (@Msmo) I'd say it's something like first partial match in recursion becomes atomic
[17:43:26] (@Msmo) 2[0-4]\d has a partial match in 250 (the 2 matches the 2)
edit:
[17:49:12] (@Msmo) ok, looks like they made (?1) atomic
[17:49:21] (@Msmo) that's it
perhaps an intentional change to make recursive matching less intensive? if so, i don't suppose there's any way you could reconsider changing it back to the way it was, or at least document this change in case others wonder about it :P
"The only excuse for making a useless script is that one admires it intensely" - Oscar Wilde
|
|
|
|
Joined: Jan 2003
Posts: 2,523
Hoopy frood
|
mIRC uses the PCRE library for its regex identifiers, so it has no control over this. This is indeed the source of the problem: all recursive items ((?1) and (?R)) are atomic, as the PCRE changelog for version 6.5 (01-Feb-06) states:
"3. A nasty bug was discovered in the handling of recursive patterns, that is, items such as (?R) or (?1), when the recursion could match a number of alternatives. If it matched one of the alternatives, but subsequently, outside the recursion, there was a failure, the code tried to back up into the recursion. However, because of the way PCRE is implemented, this is not possible, and the result was an incorrect result from the match.
In order to prevent this happening, the specification of recursion has been changed so that all such subpatterns are automatically treated as atomic groups. Thus, for example, (?R) is treated as if it were (?>(?R))."
This also applies when (?1) is used non-recursively, i.e. as a subroutine, according to the PCRE manual:
"Like recursive subpatterns, a "subroutine" call is always treated as an atomic group. That is, once it has matched some of the subject string, it is never re-entered, even if it contains untried alternatives and there is a subsequent matching failure."
I agree that it's inconvenient, if not crippling.
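A minimal sketch of the difference described above, written in Python rather than mIRC script. Python's `re` module has no `(?1)` subroutine call, so the call is expanded by hand, and the atomic behaviour is emulated with the lookahead-plus-backreference idiom `(?=(...))\1`. That idiom is an assumption standing in for PCRE's implicit `(?>...)`, and the names `OCTET`, `BACKTRACKING` and `ATOMIC` are illustrative only.

```python
import re

# Octet alternation from the original post, in its original order.
OCTET = r"1?\d?\d|2[0-4]\d|25[0-5]"

# Old behaviour: the final octet may be re-entered after a later failure.
BACKTRACKING = re.compile(rf"^(?:(?:{OCTET})\.){{3}}(?:{OCTET})$")

# New behaviour, emulated: the lookahead captures whatever the first
# successful alternative matched, and the backreference consumes exactly
# that text, so the remaining alternatives are never tried.
ATOMIC = re.compile(rf"^(?:(?:{OCTET})\.){{3}}(?=({OCTET}))\1$")

ip = "216.223.2.249"
print(bool(BACKTRACKING.match(ip)))  # True
print(bool(ATOMIC.match(ip)))        # False
```

In the atomic version the first alternative takes "24" from "249", the `$` then fails on the "9", and the engine is not allowed to go back and try `2[0-4]\d`.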
/.timerQ 1 0 echo /.timerQ 1 0 $timer(Q).com
|
|
|
|
Joined: Feb 2006
Posts: 546
Fjord artisan
|
OP
oh haha, thanks! should've figured, i need to start following those updates
|
|
|
|
Joined: Oct 2005
Posts: 1,741
Hoopy frood
|
I can't say that I know why it's doing that, but here is what my testing has shown me.
I modified the regex slightly so that I could see what was being matched:
//echo -ag $regex(a,xxx.xxx.xxx.xxx,/^((?:(1?\d?\d|2[0-4]\d|25[0-5])\.){3}(?2)$)/) - $regml(a,1) : $regml(a,2)
If you take out the last $ you can see what the regex is trying to match, and you can see why it fails.
I used these IPs (with the $ removed):
121.122.123.1 displays 1 - 121.122.123.1 : 123
121.122.123.12 displays 1 - 121.122.123.12 : 123
121.122.123.124 displays 1 - 121.122.123.124 : 123
121.122.123.234 displays 1 - 121.122.123.23 : 123
121.122.123.254 displays 1 - 121.122.123.25 : 123
It looks like the 1? at the beginning is what is messing it up. If you take out the ?, the above examples will work (with the $ in place). Obviously that will break the ability to match any number under 100, but it demonstrates the problem. I don't know much about atomic grouping, but from what I read, it is possible that (?n) has been made atomic.
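The test results above can be reproduced outside mIRC with a rough Python sketch. Python's `re` has no `(?2)` call, so the final octet is written out by hand and made atomic with the `(?=(...))\1` idiom; this is an assumed stand-in for what PCRE does, and `NO_ANCHOR` and `consumed` are hypothetical names.

```python
import re

OCTET = r"1?\d?\d|2[0-4]\d|25[0-5]"

# No trailing $, so the overall match shows how far the atomic
# final octet gets before it stops.
NO_ANCHOR = re.compile(rf"^(?:(?:{OCTET})\.){{3}}(?=({OCTET}))\1")

def consumed(ip):
    """Return the part of ip that the pattern consumes, or '' on failure."""
    m = NO_ANCHOR.match(ip)
    return m.group(0) if m else ""

for ip in ("121.122.123.1", "121.122.123.12", "121.122.123.124",
           "121.122.123.234", "121.122.123.254"):
    print(ip, "->", consumed(ip))
```

For 121.122.123.234 this prints 121.122.123.23: the leading alternative `1?\d?\d` is satisfied by two digits, and nothing later gets a chance to claim the third.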
The easiest solution I found for that specific regex situation is to rearrange the alternation order. The following code should work with all IPs.
//echo -ag $regex(a,xxx.xxx.xxx.xxx,/^((?:(2[0-4]\d|25[0-5]|1?\d?\d)\.){3}(?2)$)/) - $regml(a,1) : $regml(a,2)
(remove the extra brackets)
//echo -ag $regex(a,xxx.xxx.xxx.xxx,/^(?:(2[0-4]\d|25[0-5]|1?\d?\d)\.){3}(?1)$/)
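A small Python sketch can check the reordering across every octet value. It assumes that the lookahead-plus-backreference idiom `(?=(...))\1` is a fair emulation of PCRE's atomic subroutine call (Python's `re` has no `(?1)`); `atomic_octet` and `failures` are hypothetical helper names.

```python
import re

ORIGINAL = r"1?\d?\d|2[0-4]\d|25[0-5]"
REORDERED = r"2[0-4]\d|25[0-5]|1?\d?\d"

def atomic_octet(alternation):
    # The lookahead picks the first alternative that matches; the
    # backreference then consumes exactly that text and nothing else.
    return re.compile(rf"^(?=({alternation}))\1$")

def failures(alternation):
    """List the octet values 0..255 that the atomic pattern rejects."""
    pattern = atomic_octet(alternation)
    return [n for n in range(256) if not pattern.match(str(n))]

print(failures(ORIGINAL) == list(range(200, 256)))  # True
print(failures(REORDERED))                          # []
```

Putting the three-digit alternatives first works because each of them either matches the whole octet or fails outright, so the atomic call never commits to a partial match.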
Edit: obviously the above posts have pinpointed the reason for this change.
-genius_at_work
Last edited by genius_at_work; 23/09/06 11:21 PM.
|
|
|
|
Joined: Sep 2003
Posts: 261
Fjord artisan
|
I've always used g to match recursively.
We don't just write the scripts, we put them to the test! (ScriptBusters)
|
|
|
|
Joined: Feb 2004
Posts: 2,019
Hoopy frood
|
Very strange: someone asked me today on IRC about this exact same problem, also with IP-matching regex code, and I also suggested that he change the order of the alternatives (what g_at_work suggested). Now tonight I see this thread, kind of freaky. Do you know this guy Jensen on Dalnet or something?!
Btw the 1?\d?\d will allow for something like 04.05.06.07; are you okay with that? I don't actually know what's really a valid IP address regarding zero padding of digits.
Gone.
|
|
|
|
Joined: Feb 2006
Posts: 546
Fjord artisan
|
OP
ya thats the obvious workaround here, too bad there doesn't exist a more general solution ;>
lol foptics, no i don't know that guy, that is quite a strange coincidence!
Scorpwanna, that has nothing to do with what we're discussing here :P
Last edited by jaytea; 24/09/06 01:28 AM.
|
|
|
|
Joined: Mar 2004
Posts: 210
Fjord artisan
|
"Btw the 1?\d?\d will allow for something like 04.05.06.07 are you okay with that? I don't actually know what's really a valid IP address regarding zero padding of digits."
That's only a human-readable representation of a long integer. I doubt that you'll actually get zero padding, unless the IP is typed by hand. (Arithmetic doesn't have a concept of "zero padding".)
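To illustrate the long-integer point, here is a short Python sketch that packs a dotted quad into one 32-bit value. It assumes every part is read as plain decimal; `dotted_quad_to_int` is a hypothetical helper, not something mIRC provides.

```python
def dotted_quad_to_int(ip):
    """Pack four decimal octets into one integer, most significant first."""
    value = 0
    for part in ip.split("."):
        # int(..., 10) ignores leading zeros, so "04" and "4" are equal.
        value = (value << 8) | int(part, 10)
    return value

print(dotted_quad_to_int("4.5.6.7"))      # 67438087
print(dotted_quad_to_int("04.05.06.07"))  # 67438087
```

Both spellings give the same number under this decimal reading. Some parsers, such as C's `inet_aton`, treat a leading zero as octal, which this sketch does not model.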
|
|
|
|
|