#132573 12/10/05 10:59 AM
Joined: Jul 2005
Posts: 56
whoami (OP)
Babel fish
Hi. I've just learned regular expressions, and while I was searching for some code to study, I found this:

alias backsp {
var %m = $1
while ($regsub(%m,(^|.)\ $+ $chr($2),,%m)) !
return %m
}


What I've been trying to do is this:

alias underline {
%a = scripting
.echo -q $regsub(%a,([^gp]),\1,%a)
return %a
}


Can anyone explain to me how I can use a loop with $regsub?

#132574 12/10/05 11:25 AM
Joined: Sep 2003
Posts: 4,230
D
Hoopy frood
The $regsub identifier evaluates to the number of substitutions it made in the first %m, and stores the resulting string back into %m. As long as it made at least one substitution, the result will be non-zero, which a while loop treats as true when it is used on its own as the condition, so the loop comes back around and runs the $regsub again on the new %m text.

*** HOWEVER, I don't believe the first code works. At least from looking at it here, I think the $regsub line should look like this: while ($regsub(%m,(^|.)\ $+ $chr($2),,%m)) { }. Nothing is needed in the body of the while loop (i.e. { }), as everything is done in the $regsub.

I must say I'm not any good with regsubs, but for the most part I thought they usually didn't require repetitive processing: the regsub can be instructed to replace any number of occurrences. However, the original example may have had something to do with one pass creating newly substitutable sections of the text, which I guess would need multiple passes until no substitutions occur.

It's not actually looping
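
For readers who want to experiment outside mIRC, the loop-until-zero idea can be sketched in Python. This is only an analogue, not mIRC code: `re.subn` with `count=1` stands in for a `$regsub` call without /g, and the helper name `backsp_loop` is made up for illustration.

```python
import re

def backsp_loop(text, ch):
    """Delete `ch` plus the character before it until nothing matches."""
    # (?:^|.) means "start of string or any one character"; re.escape keeps
    # `ch` literal even if it happens to be a regex metacharacter.
    pattern = re.compile(r"(?:^|.)" + re.escape(ch))
    while True:
        # subn returns (new_text, number_of_substitutions), much like
        # $regsub returning a count and storing the result in a variable.
        text, count = pattern.subn("", text, count=1)
        if count == 0:  # zero substitutions: the loop condition is false
            return text

print(backsp_loop("this is a test", "i"))  # tss a test
print(backsp_loop("thiis", "i"))           # s
```

Each pass can expose a new character in front of the next "backspace", which is why the count has to be checked again after every substitution.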

#132575 12/10/05 12:25 PM
Joined: Feb 2004
Posts: 2,019
Hoopy frood
Before anything else, you should first state what you're trying to do.

In addition, I get the impression that you haven't learned regex enough to be able to grasp any explanation at this point.

Your first code seems to want to remove a certain char, specified by its numeric representation, together with its preceding char. $backsp(this is a test,$asc(i)) returns tss a test. You don't need a while loop for that; you can use the /g modifier, which makes the regex engine repeat the pattern on the string until all matches have been made.

/*
Usage: $backsp(string,character)

Example: //echo -a $backsp(this is a test,s)

*/

alias backsp {
var %a, %b = $regsub($1,/(?:^|.)\ $+ $base($asc($2),10,8) /gx,,%a)
return %a
}

Note that I used the x modifier, which ignores unescaped whitespace in the expression, and I made the first pair of brackets non-capturing, since you're not referencing them anywhere. Also note that a lot of characters have a special meaning in a regex. I've changed the regsub so that you can input an actual character instead of its ASCII numeric representation. However, that comes with a danger. Since we have that \ in the expression, if we used the character "S" as input, for example, it would turn into \S, which means "match a non-whitespace character". To avoid this I use the octal representation of whatever character you input, making sure there are no conflicts with built-in regex constructs. In theory there could be a conflict with backreferences, but since we're not capturing anything, this poses no problem.
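
The \S pitfall is easy to reproduce outside mIRC. The following is a rough Python sketch, not the mIRC code above: the helper names are invented for illustration, and `octal_escape` assumes a character code of 255 or lower, which is what Python's three-digit octal escapes accept.

```python
import re

def octal_escape(ch):
    # Three-digit octal escape: "S" (code 83) becomes backslash + "123".
    return "\\%03o" % ord(ch)

def backsp_naive(text, ch):
    # For ch == "S" this builds (?:^|.)\S, and \S is the
    # "non-whitespace" class rather than a literal S.
    return re.sub(r"(?:^|.)" + "\\" + ch, "", text)

def backsp_octal(text, ch):
    # For ch == "S" this builds (?:^|.)\123, which is always a literal S.
    return re.sub(r"(?:^|.)" + octal_escape(ch), "", text)

print(repr(backsp_naive("xS y", "S")))  # 'S'   (wrong characters removed)
print(repr(backsp_octal("xS y", "S")))  # ' y'  (the "xS" pair removed)
```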

Your second code seems to want to underline any character that is neither a "g" nor a "p". However, it only does one substitution, because the regex engine stops after the first match unless you tell it otherwise. If you want the regex engine to keep doing substitutions, specify the /g modifier, as was done in the backsp code.
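
The difference between one substitution and /g can be seen in a small Python analogue. It is only a sketch: `count=1` plays the role of a $regsub without /g, and the underscores are a visible stand-in for the underline control code, chosen purely for illustration.

```python
import re

word = "scripting"

# One substitution only, like $regsub without /g.
first_only = re.sub(r"([^gp])", r"_\1_", word, count=1)

# Every match replaced, like adding the /g modifier.
every_match = re.sub(r"([^gp])", r"_\1_", word)

print(first_only)   # _s_cripting
print(every_match)  # _s__c__r__i_p_t__i__n_g
```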

Note that some day you are going to run into trouble because you're not using the regex delimiters / / to enclose your regex patterns. This is especially true if you use regex in events when specifying the $ event prefix. I've had patterns not work due to lacking / /, even in cases where the expression didn't start with an "m". If you know regex, you'll know what I mean by that.

The modifiers come after the second regex delimiter /, or you can specify them by putting them inside brackets like this: (?<modifiers>), for example (?i) for case-insensitive matching.
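
As a small illustration of the inline form, here is a Python sketch (Python's `re` module, not mIRC) showing that a modifier written inside the pattern behaves like the same modifier passed from outside. The sample text is made up.

```python
import re

text = "Hello HELLO hello"

# Modifier written inside the pattern itself.
inline = re.findall(r"(?i)hello", text)

# The same modifier supplied outside the pattern.
flagged = re.findall(r"hello", text, flags=re.IGNORECASE)

print(inline)             # ['Hello', 'HELLO', 'hello']
print(inline == flagged)  # True
```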

I think you need to do some more reading and practicing, here's a good tutorial

Here's also a link to the main reference for anything regarding PCRE, which is the regex library that mIRC also uses: pcre.txt, although I don't recommend it at first as it's somewhat hard to read through. I'd go with the tutorial first.


Gone.
#132576 12/10/05 02:01 PM
Joined: Apr 2004
Posts: 871
Sat
Hoopy frood
Quote:
Your first code seems to want to remove a certain char, specified by its numeric representation, together with its preceding char. $backsp(this is a test,$asc(i)) returns tss a test. You don't need a while loop for that; you can use the /g modifier, which makes the regex engine repeat the pattern on the string until all matches have been made.

You do need a while loop, as several backspace characters can be next to each other.


Saturn, QuakeNet staff
#132577 12/10/05 02:30 PM
Joined: Feb 2004
Posts: 2,019
Hoopy frood
That depends on what you're after of course.

If the char that's passed to the regsub ($2) is an i, then one way to look at the string "thiis" is that we have an "h" followed by "ii", which is the char to be removed, twice, together with its preceding char. One could say the output should then be "ts". If that's the case, then putting a simple + after the $base code will take care of it, no while loop required.

The code for this approach:

alias backsp {
var %a, %b = $regsub($1,/(?:^|.)\ $+ $base($asc($2),10,8) +/gx,,%a)
return %a
}

On the other hand one could look at it and say:

Let's first remove "hi", which leaves "tis", and only then remove the "ti", leaving nothing but "s". For that, one will indeed need a loop.

The code for this approach:

alias _backsp {
var %a = $1, %re = /(?:^|.)\ $+ $base($asc($2),10,8) $+ /
while ($regsub(%a,%re,,%a)) !
return %a
}
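
The two readings give different answers for the same input, which is easy to check in a Python analogue. This is a sketch rather than a translation of the aliases above; both function names are invented here, and `count=1` again stands in for a $regsub without /g.

```python
import re

def remove_runs(text, ch):
    """First approach: one global pass over 'any char, then one or more ch'."""
    return re.sub(r"(?:^|.)" + re.escape(ch) + "+", "", text)

def remove_pairs_repeatedly(text, ch):
    """Second approach: remove one 'char + ch' pair at a time until stable."""
    pattern = re.compile(r"(?:^|.)" + re.escape(ch))
    previous = None
    while previous != text:
        previous = text
        text = pattern.sub("", text, count=1)
    return text

print(remove_runs("thiis", "i"))              # ts
print(remove_pairs_repeatedly("thiis", "i"))  # s
```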


Gone.
#132578 12/10/05 02:34 PM
Joined: Apr 2004
Posts: 871
Sat
Hoopy frood
Well, in general any backspace functionality follows the second method :tongue:


Saturn, QuakeNet staff
#132579 12/10/05 02:47 PM
Joined: Feb 2004
Posts: 2,019
Hoopy frood
I have no idea what the "general" thing is, but I believe ya smile


Gone.
#132580 12/10/05 04:26 PM
Joined: Apr 2003
Posts: 701
K
Hoopy frood
Here's a way to do it without a loop...
Of course, replace i with \ooo, where ooo is the octal character code of your choice. \xhh, with hh the hexadecimal character code, is OK too...
Combining it into one regex, /^i+|(.(?1)*i)/g, can fail in situations like "aii"; "aiii" works though.

Code:
alias backspace {
  var %res, %q = $regsub($1-,/(.(?1)*i)/g,,%res) $regsub(%res,/^i+/,,%res)
  return %res
}
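
Python's built-in `re` module has no (?1) recursion, so the following sketch swaps in a different technique: a plain stack. It is not a port of the regex above, only a reference for the backspace behaviour that the one-pair-at-a-time loop earlier in the thread produces, and the function name `apply_backspaces` is made up for illustration.

```python
def apply_backspaces(text, ch):
    """Treat every `ch` as a backspace that erases the character before it."""
    kept = []
    for c in text:
        if c != ch:
            kept.append(c)   # ordinary character: keep it for now
        elif kept:
            kept.pop()       # backspace: erase the most recent kept character
        # a backspace with nothing before it is simply dropped
    return "".join(kept)

print(apply_backspaces("thiis", "i"))  # s
print(apply_backspaces("iia", "i"))    # a
```

Nested "character ... backspace" pairs are balanced the same way brackets are, which is what makes a recursive pattern or a stack a natural fit.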

#132581 12/10/05 04:47 PM
Joined: Feb 2004
Posts: 2,019
Hoopy frood
I should start using that recursive feature more often, looks really handy!


Gone.
#132582 12/10/05 04:50 PM
Joined: Jan 2003
Posts: 2,523
Q
Hoopy frood
Actually there is a way to avoid the loop, although it's worse than looping. I just mention it for the 'fun' of it and because some interesting things happen when you feed this with a lot of backspaces.

First of all, it gets really slow after some point, for example:

//echo -s $_backsp(ABCDE $+ $str(a,16) $+ $str(i,18),i)

it echoes "ABC" after a few seconds here. Now the weird part. If you type the above command and then type this:
//echo -s $_backsp(ABCDEFG $+ $str(a,26) $+ $str(i,28),i)
it takes a very long time, which is normal, but it echoes neither the correct result (ABCDE) nor $null: it echoes the previous answer (ABC). PCRE has issues with recursion, and there's even a pattern with (?R), which I won't mention here, that crashes mIRC, even though the crashes are supposed to have been fixed. I'm not sure on which side, PCRE or mIRC; the point is that if a recursive pattern gets out of hand, $regsub() normally doesn't crash mIRC anymore. Instead it returns the original string unaffected, as if it didn't match anything.

Anyway, here's the code:

Code:
alias _backsp {
  var %c = \ $+ $base($asc($2),10,8) 
  !.echo -q $regsub($1,/[^ $+ %c $+ ](?R)+ %c |/gx,,%a) $regsub(%a,/^ %c +/x,,%a)
  return %a
}


/.timerQ 1 0 echo /.timerQ 1 0 $timer(Q).com
#132583 12/10/05 04:53 PM
Joined: Jan 2003
Posts: 2,523
Q
Hoopy frood
Damn you sneaky people! I posted a good 25 minutes after your post... who's gonna believe me now? :tongue:

It's weird how we came up with exactly the same solution tho... there must be only one smile

#132584 12/10/05 05:11 PM
Joined: Apr 2004
Posts: 871
Sat
Hoopy frood
Woah, I didn't even know that something like that existed! Thanks Kelder and qwerty, I have something new to check out now smile

However, a quick glance at the relevant section of the PCRE documentation seems to suggest that the solution to the slowness is atomic grouping. In fact, the following variant of qwerty's regex, which only adds a + after (?R)+ to make that quantifier possessive, seems to be fast enough to be usable, and produces the expected result for the second example as well:

Code:
alias _backsp {
  var %c = \ $+ $base($asc($2),10,8) 
  !.echo -q $regsub($1,/[^ $+ %c $+ ](?R)++ %c |/gx,,%a) $regsub(%a,/^ %c +/x,,%a)
  return %a
}

On the other hand, applying the same trick to Kelder's regex results in horribly wrong answers, so I really can't say whether it's fully correct in any case. Thoughts?
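
For anyone unsure what "atomic" changes, here is a tiny Python sketch of the idea. It does not reproduce the recursive pattern: Python's `re` has no (?R), so the example only shows a quantifier that refuses to give characters back. The lookahead-plus-backreference form is a well-known emulation that also runs on older Python versions; Python 3.11 and later can write the same thing directly as `a++`.

```python
import re

# Ordinary greedy quantifier: a+ first takes "aaa", then backtracks and
# gives one "a" back so that the final "a" can match.
greedy = re.fullmatch(r"a+a", "aaa")

# Emulated atomic group: the lookahead captures "aaa", \1 consumes exactly
# that text, and nothing can be given back, so the final "a" never matches.
atomic = re.fullmatch(r"(?=(a+))\1a", "aaa")

print(greedy is not None)  # True
print(atomic is None)      # True
```

Cutting off that backtracking is what saves time on long inputs, and it is also why a pattern that quietly relied on giving characters back can start producing wrong answers.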

(with apologies to whoami for taking this more and more off-topic)

Last edited by Sat; 12/10/05 05:23 PM.

Saturn, QuakeNet staff
#132585 12/10/05 05:54 PM
Joined: Jan 2003
Posts: 2,523
Q
Hoopy frood
Ah yes, atomic grouping seems to help indeed! One can never read pcre.txt enough smile I'll test more but so far it works like a charm.

The reason Kelder's doesn't work is that he uses . instead of [^i] just before the (?1) call.


/.timerQ 1 0 echo /.timerQ 1 0 $timer(Q).com
#132586 13/10/05 01:15 PM
Joined: Jul 2005
Posts: 56
whoami (OP)
Babel fish
well thanks for helping. grin

