
#260397 17/04/17 04:25 AM
Joined: Aug 2016
Posts: 57
R
Babel fish
OP Offline
Hi Khaled.

I noticed that the $regmlex identifier returns 0 instead of null for a group that has not captured anything. I believe that this behavior is not the most appropriate.

Example:
Code:
//noop $regex(test a regmlex_identifier,/([^\s]+)?(regmlex)([^\s]+)?/) | echo -a Number of groups captured: $regmlex(1,0) - Group 1: $regmlex(1,1) - Group 2: $regmlex(1,2) - Group 3: $regmlex(1,3)

Will print: Number of groups captured: 2 - Group 1: 0 - Group 2: regmlex - Group 3: _identifier

Example 2:
Code:
//noop $regex(test a 0regmlex_identifier,/([^\s]+)?(regmlex)([^\s]+)?/) | echo -a Number of groups captured: $regmlex(1,0) - Group 1: $regmlex(1,1) - Group 2: $regmlex(1,2) - Group 3: $regmlex(1,3)

Will print: Number of groups captured: 3 - Group 1: 0 - Group 2: regmlex - Group 3: _identifier

It may not be a bug, but just the way you've decided to treat it. However, from my point of view, if it is not a bug, the return value would have to be null for an uncaught group.

Last edited by rockcavera; 17/04/17 04:32 AM.

rockcavera
#Scripts @ irc.VirtuaLife.com.br
rockcavera #260493 29/04/17 10:42 AM
Joined: Dec 2002
Posts: 5,411
Hoopy frood
Offline
Thanks for your bug report. This does indeed look like a bug - it should have been returning $null in this case, as otherwise there would be no way to distinguish between no capture and a capture. This issue has been fixed for the next version.

rockcavera #260875 29/06/17 01:27 PM
Joined: Aug 2003
Posts: 319
P
Pan-dimensional mouse
Offline
This may or may not be part of the same bug, but $regml(N) gives the Nth non-empty group rather than the Nth group.

Code:
//noop $regex(test regml,/(test)\s(a\s)?(regml)/) | echo -a Number of groups captured: $regml(0) - Group 1: $regml(1) - Group 2: $regml(2) - Group 3: $regml(3)


results in

Code:
Number of groups captured: 2 - Group 1: test - Group 2: regml - Group 3:


when it should be

Code:
Number of groups captured: 3 - Group 1: test - Group 2: - Group 3: regml

Protopia #260876 29/06/17 01:33 PM
Joined: Jul 2006
Posts: 4,144
W
Hoopy frood
Offline
Yes, this is not related. mIRC was doing it wrong for all these years, but this has been fixed recently: you have to use the custom /F switch to enable the correct behavior for non-participating capturing groups.
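As a rough sketch, here is the command from the previous post wrapped in an alias, with only the F modifier appended after the closing slash of the pattern. The alias name is made up for illustration, and this assumes a mIRC version recent enough to support /F:

```mirc
; hypothetical alias name, for illustration only; run with /regml_f_test
alias regml_f_test {
  ; same text and pattern as in the post above, plus the F modifier
  noop $regex(test regml,/(test)\s(a\s)?(regml)/F)
  echo -a Number of groups captured: $regml(0) - Group 1: $regml(1) - Group 2: $regml(2) - Group 3: $regml(3)
}
```

With /F, group 2 should count as a (non-participating) group, so "regml" should move to $regml(3), as in the "when it should be" output above.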


#mircscripting @ irc.swiftirc.net == the best mIRC help channel
Wims #260877 29/06/17 01:37 PM
Joined: Aug 2003
Posts: 319
P
Pan-dimensional mouse
Offline
Help refers to "back references" so I didn't try the F flag, but indeed it does seem to fix it!!

Excellent - I can now switch from my kludge alias fix to native $regml.

Protopia #260879 29/06/17 01:45 PM
Joined: Jul 2006
Posts: 4,144
W
Hoopy frood
Offline
Yes, the description is not that great, and I was going to ask for an update.
Quote:
F - make back-references refer to () capture groups, which is how standard regex works.
seems to indicate that, before that, backreferences were not referring to the () capture groups, but they were. The problem before was that mIRC was ignoring non-participating capturing groups. I think this is what /F should mention. Maybe it was worded that way to keep it simple, but really I would like to see:

Quote:
F - Stop ignoring non-participating capturing groups, which is how standard regex works

Because the current description is actually complete nonsense: backreferences always refer to capture groups.

The Note: about /F in the help file is, in the same way, a bit wrong, I think.
First, there is a difference between an empty match and a non-participating capturing group.
/([a-z]*)/ on "2" creates an empty capturing group, which mIRC handled, and still handles, correctly regardless of the /F switch. But /([a-z])*2/ on "2" creates a non-participating capturing group, which is ignored without the F switch. Its value would be $null, but ignoring it also changes the total number of captures reported by $regml(0) (or $regmlex), which is more problematic.
Second, mIRC has no control over how \N backreferences are handled in the pattern; backreferences in the pattern should work the same regardless of /F.
Only the N indexes in identifiers such as $regml are going to change.
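To compare the two cases side by side, something like the following could be used. This is only a sketch: the alias name is hypothetical, and it just echoes whatever $regml(0) reports for each pattern, so you can see the count change on your own version:

```mirc
; hypothetical alias, for illustration only; run with /regml_compare
alias regml_compare {
  ; empty capture: the group participates but matches an empty string
  noop $regex(2,/([a-z]*)/)
  echo -a empty capture, no F: $regml(0)
  ; non-participating group, without the F modifier
  noop $regex(2,/([a-z])*2/)
  echo -a non-participating, no F: $regml(0)
  ; same pattern and text, with the F modifier
  noop $regex(2,/([a-z])*2/F)
  echo -a non-participating, with F: $regml(0)
}
```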

So maybe it should say
Quote:
If the F modifier is not used, non-participating capturing groups are ignored.
but that's very redundant.

As shown here, when you get a non-participating capturing group, you get -1 -1 for the offset/position.
I'm not too sure how useful it is to know whether a capturing group participated in the match, but maybe a new property could be added to $regml/$regmlex.





Last edited by Wims; 29/06/17 02:09 PM.

#mircscripting @ irc.swiftirc.net == the best mIRC help channel
Wims #260880 29/06/17 01:53 PM
Joined: Apr 2004
Posts: 871
Sat Offline
Hoopy frood
Offline
Speaking of helpfile issues with the 'F' modifier documentation: the note at the bottom of the page mentions "\N back-references in patterns". This is incorrect. Instead of "patterns", it should say "subtexts" or something like that. Backreferences in patterns are dealt with by PCRE itself, and always worked correctly anyway. If anything, the 'F' modifier synchronizes mIRC's side of things (subtext+$regml*) with what PCRE does (in the pattern).


Saturn, QuakeNet staff
Sat #260881 29/06/17 02:10 PM
Joined: Jul 2006
Posts: 4,144
W
Hoopy frood
Offline
Yes, I edited my post while you were writing yours :)


#mircscripting @ irc.swiftirc.net == the best mIRC help channel
