Posted By: rockcavera Bug $regmlex - 17/04/17 04:25 AM
Hi Khaled.

I noticed that the $regmlex identifier returns 0 instead of null for a group that has not captured anything. I believe that this behavior is not the most appropriate.

Example:
Code:
//noop $regex(test a regmlex_identifier,/([^\s]+)?(regmlex)([^\s]+)?/) | echo -a Number of groups captured: $regmlex(1,0) - Group 1: $regmlex(1,1) - Group 2: $regmlex(1,2) - Group 3: $regmlex(1,3)

Will print: Number of groups captured: 2 - Group 1: 0 - Group 2: regmlex - Group 3: _identifier

Example 2:
Code:
//noop $regex(test a 0regmlex_identifier,/([^\s]+)?(regmlex)([^\s]+)?/) | echo -a Number of groups captured: $regmlex(1,0) - Group 1: $regmlex(1,1) - Group 2: $regmlex(1,2) - Group 3: $regmlex(1,3)

Will print: Number of groups captured: 3 - Group 1: 0 - Group 2: regmlex - Group 3: _identifier

It may not be a bug, but just the way you've decided to treat it. However, from my point of view, even if it is not a bug, the return value should be $null for a group that captured nothing.
Posted By: Khaled Re: Bug $regmlex - 29/04/17 10:42 AM
Thanks for your bug report. This does indeed look like a bug - it should have been returning $null in this case, as otherwise there would be no way to distinguish between no capture and a capture. This issue has been fixed for the next version.
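The ambiguity can be sketched outside mIRC. The snippet below uses Python's re module, not mIRC's $regex, purely as an illustration; the helper name group1 is made up for this example. Python reports an unmatched group as None, which is what lets a caller tell "no capture" apart from a captured "0".

```python
import re

# Same pattern as in the report, written for Python's re module (an assumption:
# this illustrates the concept, it does not reproduce mIRC's implementation).
PATTERN = re.compile(r"([^\s]+)?(regmlex)([^\s]+)?")

def group1(text):
    """Return what group 1 captured, or None if it did not take part in the match."""
    match = PATTERN.search(text)
    return match.group(1) if match else None

no_capture = group1("test a regmlex_identifier")     # group 1 matched nothing
zero_capture = group1("test a 0regmlex_identifier")  # group 1 matched the text "0"

print(repr(no_capture), repr(zero_capture))
```

If both cases came back as 0, as in the report, the two inputs would be indistinguishable to the caller.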
Posted By: Protopia Re: Bug $regmlex - 29/06/17 01:27 PM
This may or may not be part of the same bug, but $regml(N) gives the Nth non-empty group rather than the Nth group.

Code:
//noop $regex(test regml,/(test)\s(a\s)?(regml)/) | echo -a Number of groups captured: $regml(0) - Group 1: $regml(1) - Group 2: $regml(2) - Group 3: $regml(3)


results in

Code:
Number of groups captured: 2 - Group 1: test - Group 2: regml - Group 3:


when it should be

Code:
Number of groups captured: 3 - Group 1: test - Group 2: - Group 3: regml
Posted By: Wims Re: Bug $regmlex - 29/06/17 01:33 PM
Yes, it is unrelated. mIRC was doing it wrong for all these years, but this has been fixed recently: you have to use the custom /F switch to enable the correct behavior for non-participating capturing groups.
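To make the index shift concrete, here is a rough sketch using Python's re module (an illustration only, not mIRC's code). The compacted list is just a model of the behaviour without /F as reported above: it drops groups that did not participate, so later groups slide down one index.

```python
import re

match = re.search(r"(test)\s(a\s)?(regml)", "test regml")

# Standard numbering: group 2 did not participate, so it is None,
# and "regml" stays at index 3.
standard = match.groups()

# Model of the behaviour without /F as reported above (an assumption about
# the effect, not about how mIRC implements it): skipped groups are dropped.
compacted = [g for g in standard if g is not None]

print(standard)
print(compacted)
```

In the compacted model "regml" ends up at index 2 and the count drops to 2, which matches the output shown in the report.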
Posted By: Protopia Re: Bug $regmlex - 29/06/17 01:37 PM
The help file refers to "back-references", so I didn't try the F flag, but it does indeed seem to fix it!

Excellent - I can now switch from my kludge alias fix to native $regml.
Posted By: Wims Re: Bug $regmlex - 29/06/17 01:45 PM
Yes, the description is not that great, and I was going to ask for an update.
Quote:
F - make back-references refer to () capture groups, which is how standard regex works.
This seems to indicate that, before /F, back-references were not referring to the () capture groups, but they were. The problem before was that mIRC was ignoring non-participating capturing groups. I think this is what the /F description should mention. Maybe it was worded that way to keep it simple, but really I would like to see:

Quote:
F - Stop ignoring non-participating capturing groups, which is how standard regex works

Because the current description is actually complete nonsense: back-references always refer to capture groups.

The Note: about /F in the help file is, in the same way, a bit wrong I think.
First, there is a difference between an empty match and a non-participating capturing group.
/([a-z]*)/ on "2" creates an empty capture, which mIRC handled, and still handles, correctly regardless of the /F switch. But /([a-z])*2/ on "2" creates a non-participating capturing group, which is ignored without the /F switch. Its value would be $null either way, but ignoring it changes the total number of captures reported by $regml(0) (or $regmlex), which is more problematic.
Second, mIRC has no control over how \N back-references are handled in the pattern; back-references in the pattern should work the same regardless of /F.
Only the N indexes in identifiers such as $regml are going to change.

So maybe it should say
Quote:
If the F modifier is not used, non-participating capturing groups are ignored.
but that's very redundant.

As shown here, when you get a non-participating capturing group, you get -1 -1 for the offset/position.
I'm not too sure how useful it is to know whether a capturing group participated in the match, but maybe a new property could be added to $regml/$regmlex.
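The distinction between an empty capture and a non-participating group, including the -1 -1 offsets, shows up in other regex engines too. A minimal sketch using Python's re module (not mIRC or PCRE itself; the participated helper is hypothetical and only hints at what such a property might report):

```python
import re

def participated(match, n):
    """True if group n took part in the match (Python reports span (-1, -1) otherwise)."""
    return match.span(n) != (-1, -1)

empty = re.search(r"([a-z]*)", "2")     # group 1 matches the empty string
skipped = re.search(r"([a-z])*2", "2")  # group 1 never matches anything

print(repr(empty.group(1)), empty.span(1), participated(empty, 1))
print(repr(skipped.group(1)), skipped.span(1), participated(skipped, 1))
```

The empty capture has a real position (start equals end), while the non-participating group has no position at all, which is the information a new property would need to expose.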




Posted By: Sat Re: Bug $regmlex - 29/06/17 01:53 PM
Speaking of helpfile issues with the 'F' modifier documentation: the note at the bottom of the page mentions "\N back-references in patterns". This is incorrect. Instead of "patterns", it should say "subtexts" or something like that. Backreferences in patterns are dealt with by PCRE itself, and always worked correctly anyway. If anything, the 'F' modifier synchronizes mIRC's side of things (subtext+$regml*) with what PCRE does (in the pattern).
Posted By: Wims Re: Bug $regmlex - 29/06/17 02:10 PM
Yes, I edited my post while you were writing yours :)
© mIRC Discussion Forums