Joined: Feb 2003
Posts: 2,812
Raccoon Offline OP
Hoopy frood
One of the trying things about regular expression capture groups is having to iterate through the $regml() and $regmlex() results and tediously dump them into %variables. The more complex a pattern gets and the more items you wish to capture, the bulkier this code becomes.
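
For illustration, here is a minimal sketch of the kind of boilerplate being described. The alias name `splitdate`, the regex name `date`, and the date-like pattern are all made up for this example and do not come from any actual script. Each capture has to be copied out of $regml() by position:

```mirc
; Hypothetical example: alias name, regex name "date" and the pattern are made up.
; Usage: /splitdate 2003-02-14
alias splitdate {
  if ($regex(date,$1,/^(\d+)-(\d+)-(\d+)$/)) {
    ; one line per capture, matched to the pattern by counting brackets
    var %year = $regml(date,1)
    var %month = $regml(date,2)
    var %day = $regml(date,3)
    echo -a year: %year month: %month day: %day
  }
}
```

Every additional capture group adds another /var line, and each index has to be kept in sync with the bracket order in the pattern by hand.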

I would like to propose adding support for PCRE's named capture groups and translating that capture information directly into local variables. Examples:

Named Capture Group Regex Syntax

Code:
noop $regex(string,/(?<foo>.*)/)   (PCRE 7+)
noop $regex(string,/(?'foo'.*)/)   (PCRE 7+)
noop $regex(string,/(?P<foo>.*)/)  (PCRE old)

compared to

noop $regex(string,/(.*)/)

In the three examples above, the local variable %foo would be defined and populated with the captured string.

$regml/ex() would not be necessary to retrieve the data.

(For those keeping score, no change is needed to the regex patterns mIRC supports, and backwards compatibility is not affected, since PCRE already handles and interprets all of this on its own. mIRC just doesn't make use of it at this time.)


Well. At least I won lunch.
Good philosophy, see good in bad, I like!
Joined: Jul 2006
Posts: 4,145
W
Hoopy frood
I suggested accessing named capture groups myself: https://forums.mirc.com/ubbthreads.php/topics/259456/Retrieving_named_captures_with#Post259456
While your idea may be good, I wouldn't want to be forced to assign the value to a local variable in order to access it.
If the problem is with accessing $regml(ex) itself rather than a need for named capture groups, I think it would be better to extend $regmlex with a @window/command parameter like $hfind has, which would make looping over capture groups easier.


#mircscripting @ irc.swiftirc.net == the best mIRC help channel
Joined: Feb 2003
Posts: 2,812
Raccoon Offline OP
Hoopy frood
It's not a "problem" accessing $regml(); it's tedious and utterly line consuming. I don't know that there would be fair objection to automatically storing the data to a local variable, seeing as you can choose the names yourself. It's a real value-add not having to pull the data from $regml(), which just seems like a crutch of a function compared to what any other programming language with capture groups offers.


Joined: Jul 2006
Posts: 4,145
W
Hoopy frood
$regmlex is meant to be an improved/fixed version of $regml.

Every identifier that returns the Nth item of a collection has to be looped over to get all the items. This is how it works in other languages too, although those languages certainly have their own ways to handle it better.
If anything, this would be a problem for you with all identifiers like that (except for $hfind and a few others), since they all require the same number of lines.

Quote:
I don't know that there would be fair objection to automatically storing the data to a local variable, seeing as you can choose the names yourself
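It simply breaks backward compatibility:

Code:
alias test {
  ; %a starts out as 5
  var %a = 5
  ; the pattern contains a named group that is also called "a"
  noop $regex(a,/(?<a>.*)/)
  ; under the proposal, %a would have been overwritten by the capture
  echo -a %a
}
This would then echo "a" instead of "5".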


I believe you're only looking for a way to make your life easier and are tying it to a suggestion about a different feature; I'm not a big fan of that.
However, like I said, I still think we should have something like:

noop $regex(test,string,/([a-z])([a-z])/g)
noop $regmlex(test, 0, 0,echo -a match number: $1 - capture number: $2)

You would just use /var instead of echo to create your local variable accordingly. If support for named captures is added, maybe $3 could be the name of the capture, if it has one.


Joined: Mar 2008
Posts: 93
B
Babel fish
I do like named captures (and in fact, prefer them) over trying to count brackets to figure out which match I actually want. $regml(a) (or $regmlex) sounds good enough to me (but probably conflicts with the name parameter in some way); %a isn't strictly necessary (and, as mentioned, is very likely a breaking change).

Joined: Jul 2006
Posts: 4,145
W
Hoopy frood
Currently $regml(a)/$regmlex(1,a) = $regml(0)/$regmlex(1,0), with 'a' being interpreted as invalid and therefore as 0. This could be changed to return the capture if the name matches, and to keep interpreting it as 0 otherwise.


Joined: Feb 2003
Posts: 2,812
Raccoon Offline OP
Hoopy frood
I think clearly $regml() support would require the name parameter as non-optional.

But I still stand by my request for automatically populated variable names by design. There's no broken backward compatibility because there's nothing here that anybody already uses.


Joined: Jul 2006
Posts: 4,145
W
Hoopy frood
Quote:
I think clearly $regml() support would require the name parameter as non-optional.
Agreed

Quote:
But I still stand by my request for automatically populated variable names by design. There's no broken backward compatibility because there's nothing here that anybody already uses.
That cannot be added as it breaks scripts: existing scripts which currently use named captures in their expressions would now get local variables created or modified, the latter being more problematic. I really think the way to improve identifiers like this is to use the command parameter from $hfind/$findfile etc.


Joined: Feb 2003
Posts: 2,812
Raccoon Offline OP
Hoopy frood
If $regex* doesn't currently support named capture groups, it's pretty safe to suggest that sane people aren't using named capture groups in scripts.

While in contrast, if named capture groups would populate variables of the same names, it's very safe to say people would use this feature... enthusiastically.

$regml would and should fall into disuse for the most part.


Joined: Jul 2006
Posts: 4,145
W
Hoopy frood
Named capture groups are part of the PCRE engine. If you can create a (non-named) group and reuse it inside the expression, e.g. /(a)\1/ matching "aa", you can do that with named captures as well: /(?<group>a)\k'group'/
This works in mIRC currently, and you don't need to reference the group with \k for the group to exist, so:
Quote:
If $regex* doesn't currently support named capture groups, it's pretty safe to suggest that sane people aren't using named capture groups in scripts.
In that sense, $regex *does* support named capture groups, and sane people are using named capture groups in scripts. That's why your suggestion breaks compatibility.
Quote:
if named capture groups would populate variables of the same names, it's very safe to say people would use this feature... enthusiastically.
The problem is not whether people would use the feature or not. The problem is that the suggestion changes how existing scripts work: it's not a feature that you can choose to use or not, you're forcing it on the user. What you call safe is actually very unsafe to me, since it just breaks compat.

I can understand that people would want mIRC to automatically translate named groups (or even just normal groups) to local variables, but it's already possible to do that in mIRC. You called the current way to do it "tedious and utterly line consuming", which is basically calling any script that loops over a list of results tedious and utterly line consuming, so it should be an issue for you with mIRC as a whole and not just here.

It's extremely slow to loop over $hfind and $findfile with a while loop, which is why they have a command parameter. Other identifiers that return collections like that do not have this feature. It seems clear to me that the way to improve this is to extend $regmlex (and maybe $regml) to support a command parameter and then make the call to /var there yourself, which should be very close in speed to what mIRC would be doing internally and would not break compat!
Of course this would be improving $regml* themselves; you would still need support for accessing named groups from mIRC, outside of the expression.
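
To make the comparison concrete, this is roughly what the while-loop approach looks like today. It is only a sketch: the alias name `listcaps`, the regex name `caps`, and the two-word pattern are made up for this example.

```mirc
; Hypothetical example: alias name, regex name "caps" and the pattern are made up.
; Usage: /listcaps hello world
alias listcaps {
  if ($regex(caps,$1-,/(\w+) (\w+)/)) {
    var %i = 1
    ; $regml(caps,0) returns the number of captured strings
    while (%i <= $regml(caps,0)) {
      echo -a capture %i is $regml(caps,%i)
      inc %i
    }
  }
}
```

A command parameter like the one suggested above would replace the loop and counter with a single identifier call, with the /var or /echo passed in as the command.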

