$regsubex() is unmistakably more powerful than $regsub() with respect to the type of substitutions possible. However, one crippling drawback of $regsubex() is that its result cannot be stored directly into a %variable the way $regsub()'s can, and so any consecutive spaces are collapsed. The only workaround is to constantly $replace($regsubex(),$chr(32),$LF) and then somehow manipulate $LF back into $chr(32) later on, often with a $regsub() call or a combination of /bset and /breplace. Or use the following example:

var %parseline
noop $regsub(a,$regsubex(b,$parseline,/pattern/g,subtext),,,%parseline)


Here the outer $regsub() is used for no other purpose than to assign the output of $regsubex() to the variable %parseline before its multiple consecutive spaces get collapsed by mIRC's parser. Maybe $regsubex() could support its own output variable.

It'd also be extremely fancy if more such string handlers could support &binvars. PCRE already supports binary data.
Originally Posted By: Raccoon
[..] and so any consecutive spaces are collapsed.

This is simply not true:

Code:
//var %x = $regsubex(abc,/./g,$chr(32)) | echo -ag length: $len(%x)

length: 3
Huh. Well egg on my face.

Though having $regsubex() output the match count, as $regsub() does, would be useful for if-conditions and for knowing whether any substitutions were made.

Additionally, the ability to input/output binary variables directly to and from $regsub() and $regsubex() would make using /parseline and /sockwrite a whole lot easier. It's often unacceptable for these commands to collapse spaces, which means a lot of &binvar use, which means a lot of switching back and forth between binary variables and space-placeholder text in order to make any string manipulations.

Code:
  var %parseline = $regsubex($parseline,...)
  var %p = $replace(%parseline,$chr(32),$chr(01))
  bset -t &b 1 %p
  breplace &b 01 32
  parseline -ibn &b
Quote:
Maybe $regsubex() could support its own output variable.

The only issue is that because the name parameter is optional in $regsubex(), adding a new parameter means that mIRC will not be able to tell which parameters you have specified. There are several options:

1) Make it work like $regsub(), ie. output to %var and return N, on condition that the name parameter is always specified. In cases where you are using the default regex result, you would need to specify "default" as the name.

2) Add a property .var that makes it assume that the last parameter is an output %var and return N. An odd use of a .property as a behaviour modifier but not unprecedented.

3) Add a new $regsubex2() identifier.
Objectively, I'd say that option 1 is the most intuitive. With respect to defining 'default', someone could also pass an empty parameter, like we already do for some identifiers to assert parameter positioning. $regsubex(,text,pattern,subtext,%var)

Code:
  if     ($regsubex(,$parseline,/regex/g,\2\1,%p)) { noop }
  elseif ($regsubex(,$parseline,/regey/g,\2\3,%p)) { noop }
  elseif ($regsubex(,$parseline,/regez/g,\3\1,%p)) { noop }
  else   { return }
  parseline -ot %p

Question: Would you be adding support for &binvars if I asked pretty please?
Originally Posted By: Khaled
There are several options:

I'd like to add:

4) add a parameter to $regsub() to make its subtext parameter behave like that of $regsubex()

After all, $regsub() already has the right overall syntax for this feature suggestion; it just doesn't behave exactly like $regsubex() does. Changing $regsubex() to return a different kind of value based on a parameter or property is a bit weird I think.
I think it's important to pause and think here. I believe Raccoon suggested that $regsubex gets this feature from $regsub because he thought it couldn't preserve spaces otherwise, not for any other reasons, so changing it to work like $regsub would not solve the 'space problem'.
There is one thing left which one could argue about: there's no way to know how many matches/replacements were made in $regsubex. And it's not every day that you want to know the number of matches when replacing with regex.


$regsub does that, so it's possible to extend $regsub to work more like $regsubex, but that seems wrong; why improve the old $regsub?

I'm not a fan of adding a new identifier just for getting the number of replacements made either (although $regsubexed(optional_name), like $filtered, sounds more attractive to me than improving $regsub).

To me, these solutions are not really great because they illustrate some issues in the current regex functions/implementations. For example, there is no way to get the full match, which I think is an important gap. It may be available using a capturing group on the whole pattern, but this may not be doable in some situations.

Perhaps $regmlex() could include a .full property as well as a .matches property:

Code:
noop $regsubex(name,ab-cd-de,/[a-z]+/g,)

; assume N = 1 if not specified; returns the full match for the Nth match, here "ab"
$regmlex([name],[N]).full

; returns the number of matches, here 3
$regmlex([name]).matches
Or, just keep it simple and make $regsub() and $regsubex() take the same parameters -- the only difference being that the EXtended version (aka $regsubEx) will evaluate the subtext, whereas $regsub does not evaluate the subtext. We might even make the %variable parameter in $regsub() optional and also return OutputText instead of N when %var is omitted.

That way it's easy and straightforward to explain in the help file, and for new users to understand and grasp. I'll even help author the section of /help to make it nice.
That's one solution but..

$regsub is old and ugly, nobody wants to see this guy when you have the fresh and handsome $regsubex.

$regsubex wasn't added just like that, no, it was added as a nicer version of $regsub.
Now you're telling me you want the nicer version to become exactly like the one it was superseding in the first place? I'm not a fan of this logic. It was made clear that you do not need this for space preservation, so the only 'valid' concern now is getting the number of matches. That should be addressed, but changing $regsubex's behavior to address it seems like a very wrong solution to me.
Making a change because it's easier to update the help file? Still not my logic, but this is irrelevant here.

I didn't mean to hijack the thread with my proposed change but I think my solution is just best.
o.O
It sounds to me like the goal here is to gather matchtext into an output location, whether it be a variable or newly supported binvar.

If that's the suggestion, $regsub() should be the identifier to improve. $regsub() is built for that specific use case. This is in contrast with $regsubex(), which is explicitly _not_ built for that use case. Supporting &binvars as an output to $regsub() would be trivial, I'd imagine.

If the remaining issue is that $regsub() does not parse the subtext parameter in the same way, adding a switch would easily solve this.

I would simply propose:

1. Add support for &binvar in the output parameter of $regsub()
2. Add a switch (say, 'e') to $regsub() that parses subtext in "extended" mode-- i.e., the way that $regsubex() does.

Unless I'm missing something here, this should address all issues, including any issues with space preservation that may or may not be there.
I think it's really splitting hairs in the context of design and fashion. It's my opinion that the two functions should look and breathe similarly, with their only difference being the way in which each function evaluates its substitution text, and NOT the parameters they use. That way, you can alternate between one and the other, in practice and use, without having to refer to the help file to figure out how their parameters differ -- because they'd be the same parameters.

This goes in line with $replace() vs $replaceEx(). And so too $regsub() and $regsubEx().


It is pretty normal for mIRC string functions to return Text or Number depending on the parameters provided to the function. I'm not really sure where the apprehension is coming from in this regard.

"Performs a regular expression match, like $regex(), and then performs a substitution using subtext. If an optional variable is supplied as the last parameter, the text will be stored in that variable and the function will return the number of substitutions made to the string, if any."
I extended $regsub() and $regsubex() in the latest beta, however in order to maintain backwards compatibility, I made as few changes as possible to both - $regsubex() in particular will need testing to make sure it doesn't break anything.
Another request.

Allow the input parameter to be a &binvar if, and only if, the output parameter is also a &binvar.

$regsubex(name,&binvar,/pattern/,sub,&binvar)

Because there is currently no way to easily modify the contents of a binary variable without...

Code:
breplace &binvar 00 255
var %string = $bvar(&binvar,1-).text
noop $regsubex(,%string,/pattern/,sub,&binvar)

and this assumes that it's safe to replace $chr(00) with $chr(255).

Since $regsubex will only accept a binary variable as input when there's also an output binary variable, this should remain backwards compatible even in the unlikely case of a literal "&text" input that's not intended to be a binary variable.

Though, my personal belief is that if someone really needs to pass "&text" literally to regex then they can place it inside a %variable first. So if you want, you could make it so all $regex functions are able to accept &binvar inputs naturally, without requiring a &binvar output too. Including $regex() which has no output variable parameter.
As $regsubex is meant to return a string, I'd suggest extending the now less-used $regsub.

The reason is that $regsub already takes an output variable, so it just seems logical to extend upon that functionality.
Code:
$regsub(name, &input, /pattern/, sub, &output)
Originally Posted By: FroggieDaFrog
As $regsubex is meant to return a string, I'd suggest extending the now less-used $regsub.


You'll note that earlier within this thread, we accomplished the addition of an output variable for $regsubex, too. It's now in the mIRC help file.
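
A minimal sketch of that output-variable form, assuming the help-file syntax $regsubex([name],text,re,subtext,[%var|&binvar]), where supplying the last parameter makes the identifier return the number of substitutions instead of the substituted text. The alias name, the regex name "demo", and the variables %out and %n are made up for illustration; the name is given explicitly so the parameter positions are unambiguous.

Code:
; hypothetical alias, run with /regsubex_demo
alias regsubex_demo {
  ; declare the local variable that will receive the substituted text
  var %out
  ; with an output variable supplied, $regsubex() is assumed to return the substitution count
  var %n = $regsubex(demo,abc,/[ab]/g,x,%out)
  echo -ag substitutions: %n result: %out
}

The count in %n can then be used directly in an if-condition, which was the remaining use case raised earlier in the thread.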
Originally Posted By: Sat
Originally Posted By: Raccoon
[..] and so any consecutive spaces are collapsed.
This is simply not true:
Code:
//var %x = $regsubex(abc,/./g,$chr(32)) | echo -ag length: $len(%x)
length: 3

Hey Sat. I figured out why I made this faux pas: '/bset -t &binvar 1 %string' doesn't preserve spaces, and $regsub didn't support &binvar either. I just articulated this wrong.
© mIRC Discussion Forums