escaped characters in regsub #113483 05/03/05 05:47 PM
t0m (OP), Pikka bird, Joined: Mar 2003, Posts: 16

Hi there. I've been trying to alter a string's case with $regsub and found that it does not work as expected: escaped characters in the 'subtext' part of $regsub are not given their special meaning. When trying to apply title case (i.e. All Words Are Capitalized), I used the following:

    $regsub(%string,/(\b.)/g,\U\1\E,%string)

\U means 'uppercase until \E'. This did not give the expected result, i.e. the word boundary plus the following character in uppercase. What it does instead is return "U" + \1 (the match) + "E". I don't know if this is a problem with mIRC or PCRE, or with the implementation of PCRE within mIRC, but I'd like to see it fixed.

    perl -e '($a = "this bug") =~ s/\b(.)/\u\1/g; print "$a\n";'

returns This Bug, which is correct. The regex I used is not exactly the same, but s/(\b.)/\U\1\E/g would work as well. For reference, see http://www.perlpod.com/5.8.4/pod/perlre.html

Hope it's fixable!

Re: escaped characters in regsub #113484 05/03/05 08:32 PM
qwerty, Hoopy frood, Joined: Jan 2003, Posts: 2,523

PCRE itself doesn't support these escape sequences. This is documented in PCRE's manual, more specifically in the section DIFFERENCES FROM PERL:

Code:
4. The following Perl escape sequences are not supported: \l, \u, \L, \U, \P, \p, and \X. In fact these are implemented by Perl's general string-handling and are not part of its pattern matching engine. If any of these are encountered by PCRE, an error is generated.

However, from what I've seen, there are no 'substitute' facilities in PCRE itself (like Perl's s/re/sub/): $regsub() only uses PCRE for pattern matching and capturing. The substitutions, as well as the meaning of special chars and sequences in <subtext>, are handled by mIRC itself. So I guess your report can be viewed as a feature suggestion: support for these escape sequences.

In my opinion (and other scripters'), a more flexible solution for mIRC would be the ability to pass \1 in <subtext> to mIRC identifiers. This way you would be able to use $regsub(string,/\b(.)/g,$upper(\1),%var).

/.timerQ 1 0 echo /.timerQ 1 0 $timer(Q).com
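A minimal sketch of the callback-style substitution suggested above, written in Python rather than mIRC script so that it can be run as-is. The function name `title_case` is hypothetical, and Python's `re` module stands in for PCRE here; the sketch only illustrates the idea of passing each captured \1 through an uppercasing function instead of relying on \U or \u in the replacement text.

```python
import re

def title_case(text):
    # \b(.) captures the single character that follows each word boundary.
    # The callable plays the role of the wished-for $upper(\1): it receives
    # every match in turn and returns that match's replacement text.
    return re.sub(r"\b(.)", lambda m: m.group(1).upper(), text)

print(title_case("this bug"))  # This Bug
```

The case change happens in ordinary string-handling code outside the pattern engine, which is the same split the PCRE manual describes for Perl.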

Re: escaped characters in regsub #113485 05/03/05 08:42 PM
tidy_trax, Hoopy frood, Joined: Nov 2003, Posts: 2,327

Quote:
The substitutions, as well as the meaning of special chars and sequences in <subtext>, are handled by mIRC itself.

Are you sure about that? If mIRC handles them, I'd have thought you would use the normal method of escaping identifiers ($!identifier(\1)), but instead you have to use \$identifier(\1).

New username: hixxy

Re: escaped characters in regsub #113486 05/03/05 09:00 PM
t0m (OP), Pikka bird, Joined: Mar 2003, Posts: 16

Both solutions look good to me. The implementation of special escapes gets my vote, though I guess it's less likely to happen. Thank you, qwerty.

Re: escaped characters in regsub #113487 05/03/05 09:40 PM
qwerty, Hoopy frood, Joined: Jan 2003, Posts: 2,523

The parsing of parameters in <subtext> is the same as with any other mIRC identifier: $!identifier still evaluates to $identifier. However, $ is considered a special char in <subtext>: $1 is the same as \1, $2 = \2 etc:

    //var %a, %b = $regsub(cd,/(.)/g,A$1B,%a) | echo -s %a

result: AcBAdB

Most probably, this feature has its roots in Perl. To escape $ in subtext, you use \$. Note that the \ in \$ident also prevents mIRC from evaluating "$ident" simply because it touches the $, like it does in

    //echo -s \$me

Also note that

    //var %a, %b = $regsub(cd,/(.)/g,A $1 B,%a) | echo -s %a

wouldn't give "A c BA d B" because mIRC still evaluates <subtext>, as it does with all identifier params. So in this case, it would try to evaluate $1, which would be the first param passed to the calling routine (e.g. an alias or an event):

    //tokenize 32 TEST | var %a, %b = $regsub(cd,/(.)/g,A $1 B,%a) | echo -s %a

To get it to work like the first example, you need to use $!1.

Last edited by qwerty; 05/03/05 09:48 PM.
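The back-reference rules described above ($1 is the same as \1, and \$ gives a literal $) can be modelled in a few lines of Python. This is only a sketch of the behaviour as described in the post, not mIRC's actual implementation: `expand_subtext` and `regsub_model` are hypothetical names, Python's `re` stands in for PCRE, and mIRC's own evaluation of identifiers such as $1 before the substitution is not modelled.

```python
import re

# Hypothetical model of the <subtext> rules from the post above.
# Matches either an escaped dollar (\$) or a back-reference (\N or $N).
_TOKEN = re.compile(r"\\\$|[\\$](\d)")

def expand_subtext(subtext, match):
    def repl(tok):
        if tok.group(0) == r"\$":
            return "$"                        # \$ yields a literal dollar sign
        return match.group(int(tok.group(1)))  # \N and $N both mean capture N
    return _TOKEN.sub(repl, subtext)

def regsub_model(text, pattern, subtext):
    # Expand the subtext once per match, like the /g modifier does.
    return re.sub(pattern, lambda m: expand_subtext(subtext, m), text)

print(regsub_model("cd", r"(.)", "A$1B"))    # AcBAdB
print(regsub_model("cd", r"(.)", r"A\1B"))   # AcBAdB
print(regsub_model("cd", r"(.)", r"\$1"))    # $1$1
```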

Re: escaped characters in regsub #113488 05/03/05 10:32 PM
tidy_trax, Hoopy frood, Joined: Nov 2003, Posts: 2,327

Ah, I remember reading that somewhere else now. I still don't understand why you can't use a $replace() there though; if $1 is evaluated like normal then surely any other identifier should be.

Re: escaped characters in regsub #113489 06/03/05 11:03 AM
qwerty, Hoopy frood, Joined: Jan 2003, Posts: 2,523

You can use any identifier, you just can't pass the contents of \1 to it; if you try, the identifier is passed the string "\1", returns a value, and then mIRC replaces any \1 in it with the PCRE-captured content:

    //var %a, %b = $regsub(a,/(a)/,$str(\1,3),%a) | echo -s %a

"aaa"

$str() indeed worked, returning "\1\1\1" (like it would inside any other identifier, e.g. //echo -a $replace($str(\1,3),1,2) ). Then mIRC replaced every \1 with the captured "a".

What we would all like is for \1 to be replaced before the standard evaluation of idents/variables in <subtext> (which would have to be repeated as many times as the number of matches).
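The difference between the two evaluation orders can be sketched in Python. This is a rough model under stated assumptions, not mIRC's code: `regsub_current` and `regsub_proposed` are hypothetical names, the `make_subtext` callable stands in for an identifier such as $str() or $upper(), and Python's `re` stands in for PCRE.

```python
import re

def regsub_current(text, pattern, make_subtext):
    # Order described in the post: the identifier is evaluated once, up front,
    # while "\1" is still a literal two-character string...
    subtext = make_subtext(r"\1")
    # ...and only afterwards is every "\1" replaced with the captured text.
    return re.sub(pattern, lambda m: subtext.replace(r"\1", m.group(1)), text)

def regsub_proposed(text, pattern, make_subtext):
    # Proposed order: substitute the capture first, then evaluate the
    # identifier again for every match.
    return re.sub(pattern, lambda m: make_subtext(m.group(1)), text)

print(regsub_current("a", r"(a)", lambda s: s * 3))       # aaa
print(regsub_current("this bug", r"\b(.)", str.upper))    # this bug
print(regsub_proposed("this bug", r"\b(.)", str.upper))   # This Bug
```

With the current order, uppercasing the literal "\1" changes nothing, so the capture comes back as it was; with the proposed order the function sees the captured text itself.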

Re: escaped characters in regsub #113490 06/03/05 11:09 AM
FiberOPtics, Hoopy frood, Joined: Feb 2004, Posts: 2,019, Leuven, Belgium

Quote:
What we would all like is \1 to be replaced before the standard evaluation of idents/variables in <subtext> (which would have to be repeated as many times as the number of matches).

That would be such a great addition (as suggested many times before). I really hope Khaled will add that in the next version!

Gone.

Re: escaped characters in regsub #113491 06/03/05 05:00 PM
tidy_trax, Hoopy frood, Joined: Nov 2003, Posts: 2,327

Ah yes, I see now, great idea.

Re: escaped characters in regsub #113492 06/03/05 08:17 PM
DaveC, Hoopy frood, Joined: Sep 2003, Posts: 4,230

Having \1 etc. replaced before evaluation flies in the face of all logical evaluation order; I mean, it's a parameter that's passed to the procedure. You could always mark up the output so the \1 etc. return values were easy to identify, placing some type of tags in front and behind. Not that I'm saying it's not a damn fine idea, but I thought it should use a new command to do that, something in the nature of the looping, alias-callable commands such as filter or findfile.

Re: escaped characters in regsub #113493 06/03/05 08:37 PM
starbucks_mafia, Hoopy frood, Joined: Dec 2002, Posts: 2,962, Norwich, UK

Quote:
Having it \1 etc replaced before evaluation flys in the face of all logical evaluation order, i mean its a parameter thats passed to the procedure, you could always markup the output so the \1 etc return values were easy to identify, placing some type of tags infront and behind etc.

So? $findfile() and $finddir() do it already. If someone wants to evaluate identifiers beforehand they can use evaluation brackets. Using 'markup' is very fiddly.

Spelling mistakes, grammatical errors, and stupid comments are intentional.

Re: escaped characters in regsub #113494 06/03/05 08:53 PM
qwerty, Hoopy frood, Joined: Jan 2003, Posts: 2,523

$findfile isn't that different from the proposed behaviour in $regsub. $findfile's [command] parameter is already evaluated very differently from the parameters of other identifiers: it's a mini-environment where you can use expressions with $1- etc. and where the contained code is evaluated each time a new file is found. This behaviour is (to my eyes) almost identical to the hypothetical $regsub, except for two things:
- the parameter wouldn't be treated as a command but as a value to replace certain substrings in the input string.
- $regsub would use \1 instead of $1

These differences are minor details, both from the user's and the developer's perspective; I'm guessing that Khaled wouldn't have to work too hard to make this happen, since he did a similar thing with $findfile.

Edit: too slow once again, starbucks summed it up as I was writing my post.

Last edited by qwerty; 06/03/05 08:59 PM.