escaped characters in regsub #113483 05/03/05 05:47 PM
t0m (OP)
Pikka bird
Joined: Mar 2003
Posts: 16
hi there,

I've been trying to change a string's case with $regsub and found that it doesn't work as expected: escape sequences in the <subtext> part of $regsub are not given their special meaning.
When trying to apply title case (i.e. All Words Are Capitalized), I used the following:
$regsub(%string,/(\b.)/g,\U\1\E,%string)
\U means 'uppercase until \E'. This did not give the expected result, i.e. the character after each word boundary in uppercase. What it does instead is return "U" + \1 (the match) + "E".
I don't know if this is a problem with mIRC or PCRE, or with the implementation of PCRE within mIRC, but I'd like to see it fixed.

perl -e '($a = "this bug") =~ s/\b(.)/\u\1/g; print "$a\n";'
prints "This Bug", which is correct. The regex I used is not exactly the same, but s/(\b.)/\U\1\E/g would work as well.

for reference, see http://www.perlpod.com/5.8.4/pod/perlre.html

Hope it's fixable!

Re: escaped characters in regsub #113484 05/03/05 08:32 PM
qwerty
Hoopy frood
Joined: Jan 2003
Posts: 2,523
PCRE itself doesn't support these escape sequences. This is documented in PCRE's manual, more specifically in section DIFFERENCES FROM PERL:
Code:
     4. The following Perl escape sequences are not supported:
     \l, \u, \L, \U, \P, \p, and \X. In fact these are implemented
     by Perl's general string-handling and are not part of its
     pattern matching engine. If any of these are encountered by
     PCRE, an error is generated.

However, from what I've seen, there are no 'substitute' facilities in PCRE itself (like Perl's s/re/sub/): $regsub() only uses PCRE for pattern matching and capturing. The substitutions, as well as the meaning of special chars and sequences in <subtext>, are handled by mIRC itself. So I guess your report can be viewed as a feature suggestion: support for these escape sequences. In my opinion (and that of other scripters), a more flexible solution for mIRC would be the ability to pass \1 in <subtext> to mIRC identifiers. This way you would be able to use $regsub(string,/\b(.)/g,$upper(\1),%var).
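
To illustrate the split described above, here is a minimal Python sketch (not mIRC's actual code) of a host that uses the regex engine only for matching and expands <subtext> on its own. The names host_regsub and expand are hypothetical, and the rule that an unrecognised escape such as \U collapses to the bare letter is an assumption, chosen only because it mirrors the output t0m reported.

```python
import re

def expand(subtext, match):
    # Host-side expansion of the replacement text (assumed rules):
    #   backslash + digit -> text of that capture group
    #   backslash + other -> the other character, taken literally
    out = []
    i = 0
    while i < len(subtext):
        ch = subtext[i]
        if ch == "\\" and i + 1 < len(subtext):
            nxt = subtext[i + 1]
            if nxt in "123456789":
                n = int(nxt)
                # Unknown or unmatched groups expand to nothing.
                if n <= match.re.groups and match.group(n) is not None:
                    out.append(match.group(n))
            else:
                out.append(nxt)
            i += 2
        else:
            out.append(ch)
            i += 1
    return "".join(out)

def host_regsub(text, pattern, subtext):
    # The regex engine only finds matches; the host stitches the result.
    pieces = []
    last = 0
    count = 0
    for m in re.finditer(pattern, text):
        pieces.append(text[last:m.start()])
        pieces.append(expand(subtext, m))
        last = m.end()
        count += 1
    pieces.append(text[last:])
    return count, "".join(pieces)

print(host_regsub("this bug", r"(\b.)", r"\U\1\E"))
```

Under these assumptions \U and \E come out as plain "U" and "E" around each match. Case conversion would only happen if the host's expansion step implemented it.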


/.timerQ 1 0 echo /.timerQ 1 0 $timer(Q).com
Re: escaped characters in regsub #113485 05/03/05 08:42 PM
tidy_trax
Hoopy frood
Joined: Nov 2003
Posts: 2,327
Quote:
The substitutions, as well as the meaning of special chars and sequences in <subtext>, are handled by mIRC itself.


Are you sure about that? If mIRC handles them, I'd have thought you would use the normal method of escaping identifiers ($!identifier(\1)), but instead you have to use \$identifier(\1).


New username: hixxy
Re: escaped characters in regsub #113486 05/03/05 09:00 PM
t0m (OP)
Pikka bird
Joined: Mar 2003
Posts: 16
Both solutions look good to me. The implementation of the special escapes gets my vote, though I guess it's less likely to happen.
Thank you, qwerty.

Re: escaped characters in regsub #113487 05/03/05 09:40 PM
qwerty
Hoopy frood
Joined: Jan 2003
Posts: 2,523
The parsing of parameters in <subtext> is the same as with any other mIRC identifier: $!identifier still evaluates to $identifier. However, $ is considered a special char in <subtext>: $1 is the same as \1, $2 is the same as \2, etc.:
//var %a, %b = $regsub(cd,/(.)/g,A$1B,%a) | echo -s %a
result: AcBAdB
Most probably, this feature has its roots in Perl. To escape $ in <subtext>, you use \$. Note that the \ in \$ident also prevents mIRC from evaluating "$ident", simply because it touches the $, as it does in //echo -s \$me
Also note that
//var %a, %b = $regsub(cd,/(.)/g,A $1 B,%a) | echo -s %a
wouldn't give "A c BA d B", because mIRC still evaluates <subtext>, as it does with all identifier params. So in this case it would try to evaluate $1, which would be the first param passed to the calling routine (e.g. an alias or an event):
//tokenize 32 TEST | var %a, %b = $regsub(cd,/(.)/g,A $1 B,%a) | echo -s %a
To get it to work like the first example, you need to use $!1.

Last edited by qwerty; 05/03/05 09:48 PM.

Re: escaped characters in regsub #113488 05/03/05 10:32 PM
tidy_trax
Hoopy frood
Joined: Nov 2003
Posts: 2,327
Ah, I remember reading that somewhere else now. :)
I still don't understand why you can't use a $replace() there, though. If $1 is evaluated like normal, then surely any other identifier should be.


Re: escaped characters in regsub #113489 06/03/05 11:03 AM
qwerty
Hoopy frood
Joined: Jan 2003
Posts: 2,523
You can use any identifier; you just can't pass the contents of \1 to it. If you try, the identifier is passed the literal string "\1" and returns a value, and only then does mIRC replace any \1 in that value with the PCRE-captured content:

//var %a, %b = $regsub(a,/(a)/,$str(\1,3),%a) | echo -s %a
"aaa"

$str() did indeed work, returning "\1\1\1" (as it would inside any other identifier, e.g. //echo -a $replace($str(\1,3),1,2) ). Then mIRC replaced every \1 with the captured "a". What we would all like is \1 to be replaced before the standard evaluation of idents/variables in <subtext> (which would have to be repeated as many times as the number of matches).
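
The evaluation order described above can be modelled with a short Python sketch. This is only an analogy under stated assumptions: regsub_current_order is a hypothetical name, the lambda stands in for the identifier written in <subtext>, and Python's re.sub plays the role of the later back-reference substitution.

```python
import re

def regsub_current_order(text, pattern, build_subtext):
    # Step 1: the subtext expression is evaluated once, up front, while
    # the back-reference is still just the two characters backslash + 1.
    subtext = build_subtext(r"\1")
    # Step 2: only now are back-references filled in for each match.
    return re.sub(pattern, subtext, text)

# Like $str(\1,3): the identifier returns "\1\1\1", which then expands
# to three copies of the captured text.
print(regsub_current_order("a", r"(a)", lambda ref: ref * 3))

# Like $upper(\1): uppercasing the literal "\1" changes nothing, so the
# captured text is put back in its original case.
print(regsub_current_order("this bug", r"\b(.)", lambda ref: ref.upper()))
```

In this model the second call leaves the string untouched, which is why $upper(\1) cannot produce title case with the current order.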


Re: escaped characters in regsub #113490 06/03/05 11:09 AM
FiberOPtics
Hoopy frood
Joined: Feb 2004
Posts: 2,019
Quote:
What we would all like is \1 to be replaced before the standard evaluation of idents/variables in <subtext> (which would have to be repeated as many times as the number of matches).

That would be such a great addition (as suggested many times before). I really hope Khaled will add that in the next version!


Gone.
Re: escaped characters in regsub #113491 06/03/05 05:00 PM
tidy_trax
Hoopy frood
Joined: Nov 2003
Posts: 2,327
Ah yes, I see now. Great idea. :)


Re: escaped characters in regsub #113492 06/03/05 08:17 PM
DaveC
Hoopy frood
Joined: Sep 2003
Posts: 4,230
Having \1 etc. replaced before evaluation flies in the face of all logical evaluation order; I mean, it's a parameter that's passed to the procedure. You could always mark up the output so the \1 etc. return values were easy to identify, placing some type of tags in front of and behind them.

Not that I'm saying it's not a damn fine idea, but I think it should use a new command to do that, something along the lines of the looping, alias-calling commands such as filter or findfile.

Re: escaped characters in regsub #113493 06/03/05 08:37 PM
starbucks_mafia
Hoopy frood
Joined: Dec 2002
Posts: 2,962
Quote:
Having \1 etc. replaced before evaluation flies in the face of all logical evaluation order; I mean, it's a parameter that's passed to the procedure. You could always mark up the output so the \1 etc. return values were easy to identify, placing some type of tags in front of and behind them.

- So? $findfile() and $finddir() do it already. If someone wants to evaluate identifiers beforehand they can use evaluation brackets. Using 'markup' is very fiddly.


Spelling mistakes, grammatical errors, and stupid comments are intentional.
Re: escaped characters in regsub #113494 06/03/05 08:53 PM
qwerty
Hoopy frood
Joined: Jan 2003
Posts: 2,523
$findfile isn't that different from the proposed behaviour in $regsub. $findfile's [command] parameter is already evaluated very differently from the parameters of other identifiers: it's a mini-environment where you can use expressions with $1- etc., and where the contained code is evaluated each time a new file is found. This behaviour is (to my eyes) almost identical to the hypothetical $regsub, except for two things:
- the parameter wouldn't be treated as a command but as a value to replace certain substrings in the input string.
- $regsub would use \1 instead of $1
These differences are minor details, both from the user's and the developer's perspective; I'm guessing that Khaled wouldn't have to work too hard to make this happen, since he did a similar thing with $findfile.
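
As a rough illustration of the $findfile-style behaviour proposed above, the following Python sketch re-evaluates the replacement expression once per match. It is a model only: regsub_per_match is a hypothetical name, the lambda stands in for an expression such as $upper(\1), and Python's re.subn is used simply because it returns both the new string and the number of substitutions.

```python
import re

def regsub_per_match(text, pattern, evaluate):
    # `evaluate` is re-run for every match and receives that match's
    # captured groups, so group 1 plays the role of \1.
    return re.subn(pattern, lambda m: evaluate(*m.groups()), text)

# In the spirit of $regsub(this bug,/\b(.)/g,$upper(\1),%var)
result, count = regsub_per_match("this bug", r"\b(.)", lambda g1: g1.upper())
print(count, result)
```

Because the expression runs after the capture is known, the uppercase conversion now acts on the captured character itself and not on the literal "\1".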

Edit: too slow once again, starbucks summed it up as I was writing my post

Last edited by qwerty; 06/03/05 08:59 PM.
