Posted By: nycdiesel Regex Help - 20/10/05 06:48 AM
How would I extract words beginning with a $ from a line like this:
Code:
.......$blah.......$blah..


Thanks.
Posted By: Rand Re: Regex Help - 20/10/05 07:22 AM
This should probably work for you (though you may need to tweak it for your own use):
Code:
alias blahtest {
  var %text = $eval($one $two $five $four $three $six $nine $eight $seven $ten just a little $blah test for you to $blah see,0) , %i = 1
  if ($regex(testname,%text,/(\$[^ ]+)/g)) {
    while ($regml(testname,%i)) {
      echo -stg $v1 | inc %i
    }
  }
}


If the lines really do contain a run of dots (.......), you would have to use: if ($regex(testname,%text,/(\$[^\.]+)/g)) {
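
For anyone who wants to try the two patterns outside mIRC, here is a rough Python sketch. It assumes Python's re module treats these simple patterns the same way as the PCRE engine behind $regex; the function name dollar_words and the sample lines are made up for illustration.

```python
import re

def dollar_words(line, dotted=False):
    """Return every $-prefixed run found in line (hypothetical helper)."""
    # Space-separated text: a word ends at the next space.
    # Dot-separated text: a word ends at the next dot instead.
    pattern = r"\$[^.]+" if dotted else r"\$[^ ]+"
    return re.findall(pattern, line)

print(dollar_words("just a little $blah test for you to $blah2 see"))
print(dollar_words(".......$blah.......$blah2..", dotted=True))
```

re.findall plays the role of the /g modifier plus the $regml loop: it returns all matches at once instead of one per index.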
Posted By: nycdiesel Re: Regex Help - 20/10/05 07:40 AM
Thanks, works great.
Posted By: FiberOPtics Re: Regex Help - 20/10/05 02:18 PM
Two small tips:

* You can replace [^ ] with \S.

\S matches any non-whitespace character and will be slightly faster, though the difference is negligible.

* A dot does not need escaping if it's in a character class as it loses its meta meaning within it.
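
Both tips can be checked quickly outside mIRC. The following is only a Python sketch with made-up sample strings, assuming Python's re handles these character classes the same way PCRE does. It also shows one small difference worth knowing: [^ ] only excludes the space character, while \S excludes every whitespace character, tabs included.

```python
import re

# A dot inside a character class is literal, whether escaped or not.
line = "...$one...$two.."
assert re.findall(r"\$[^\.]+", line) == re.findall(r"\$[^.]+", line)

# [^ ] and \S agree on plain space-separated text...
text = "a $b c"
assert re.findall(r"\$[^ ]+", text) == re.findall(r"\$\S+", text)

# ...but differ when some other whitespace (here a tab) follows the word.
tabbed = "$b\tc"
print(re.findall(r"\$[^ ]+", tabbed))  # the tab is "not a space", so it is swallowed
print(re.findall(r"\$\S+", tabbed))    # \S stops at the tab
```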
Posted By: Rand Re: Regex Help - 20/10/05 02:25 PM
Thanks.

I actually only started playing with Perl fairly recently, which got me slightly into regex, though nothing super complicated yet. A few days later I realised mIRC had $regex(), so after messing around with a few Perl scripts, I started using mIRC's $regex() functions.

Before I tried to learn some Perl, I was clueless about regular expressions, so everything I had done in mIRC up to that point relied on $gettok(). Oh, the horror.
Posted By: FiberOPtics Re: Regex Help - 20/10/05 02:28 PM
Amazing that an IRC client like mIRC has regex support, isn't it? It's a beautiful tool, though with somewhat of a learning curve, at least if you want to write optimized or complex patterns.

I remember looking at the /help $regex page, blinking, and thinking: wtf is this stuff o.O
Posted By: Rand Re: Regex Help - 20/10/05 02:39 PM
Haha, yeah. I did that ages ago too. I saw it in the help file, and since it didn't actually give any useful information on how to use it, I just ignored it and moved on (and completely forgot about it until I played with Perl).

Fun stuff though, $regex() makes mIRC scripting a *lot* easier to deal with.
Posted By: FiberOPtics Re: Regex Help - 20/10/05 08:07 PM
I didn't think of it at the time, but you could use $regsub to delete any string that doesn't match our criteria. Something like:

Code:
alias blah {
  var %a 
  tokenize 32 $(,,$regsub($1-,/.*?((?<= |^)\$\S+ ?)|.*,\1,%a)) %a
  echo -a $*
}


Note to anyone testing this: you must not use the double slash (//) when running it from the command line, or mIRC will evaluate the identifiers that you pass to it. Similarly, if you call it from within a script, the identifiers must be escaped, or again they will be evaluated.

/blah this $is a $test $to see if $this works

or

//blah this $!is a $!test $!to see if $!this works

Note that it does not catch the $rc in something like "mi$rc", since the original requester asked for "words starting with $". I am still wondering what exactly he meant by those dots, though: are they supposed to be real dots, or do they stand for other words in between?
Posted By: Sigh Re: Regex Help - 21/10/05 02:49 AM
Double check that regex, it doesn't match what you want it to. I'd go for:

Code:
$regsub($1-,/(?:^| )[^$]\S*/g,,%a)
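
To see what this deletion approach leaves behind, here is a small Python sketch of the same pattern. It assumes single spaces between words (as in mIRC's $1-), and keep_dollar_words is just an illustrative name, not anything from mIRC.

```python
import re

def keep_dollar_words(line):
    """Delete every word that does not start with "$" and return the rest."""
    # A word to delete sits at the start of the line or right after a space,
    # and its first character is anything except "$".
    stripped = re.sub(r"(?:^| )[^$]\S*", "", line)
    # In mIRC, /tokenize 32 would split the result on spaces;
    # str.split() plays that role here.
    return stripped.split()

print(keep_dollar_words("this $is a $test $to see if mi$rc $this works"))
```

Because the pattern removes the unwanted words rather than capturing the wanted ones, "mi$rc" is dropped along with the other words that do not begin with $.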
Posted By: FiberOPtics Re: Regex Help - 21/10/05 03:11 AM
Ah I made some edits in the code after posting it, and the last part of the regsub got cut off. This is what it really looked like:

Code:
alias blah {
  var %a
  tokenize 32 $(,,$regsub($1-,/.*?((?<= |^)\$\S+ ?)|.*/g,\1,%a)) %a
  echo -a $*
}

As you can see, the version I posted could never have worked correctly: the closing regex delimiter / was cut off (along with the g modifier), while the opening one was still there.

Nevertheless I do prefer yours, it's simpler and somewhat more efficient in regex terms.

Too bad I can't edit the other post anymore. Oh well.
Posted By: Rand Re: Regex Help - 21/10/05 03:32 AM
I'm not actually familiar with $() or $*.
Posted By: FiberOPtics Re: Regex Help - 21/10/05 03:41 AM
$() is the same as $eval(), just in a shorter notation.

$() is used here in a special form, which was actually discovered by Sigh. If you put code in the third parameter, it acts as a noop call: the parameter is evaluated, but nothing is returned.

Similar to:

//echo -a $null($regex(abc,/(.)/)) $regml(1)

which does the same as

//echo -a $(,,$regex(abc,/(.)/)) $regml(1)


Why $(,,<parameter>) and not $null?

Unlike $(), you can create a custom alias for $null which will override the default one.

alias null return lol

That makes $null cease to be the noop construct that it is (noop = no operation). With the $(,,) construct, or $time(,<parameter>) etc., you can be 100% sure that it cannot be overridden by a custom alias.


Regarding $*, it's an identifier that you will only find mentioned in versions.txt. It calls the command that you pass it to once for each of the $0 tokens, filling in the next token ($1, $2, ...) each time.

Example:

//tokenize 46 a.b.c | echo -a $*

Total amount of tokens = $0 = 3, so it calls the echo command 3 times.

First time: with $1 = a, so it echoes "a"
Second time: with $2 = b, so it echoes "b"
Third time: with $3 = c, so it echoes "c"

One of my favourite things that I tell people to do when teaching them about $* is this command:

//tokenize 46 $str(lol.,200) | echo -a $*

Note that you can only pass $* to commands, not identifiers, so doing something like:

//tokenize 32 1 2 3 | echo -a $gettok(one two three four,$*,32)

will not work, as it will echo "4" three times. That's because the $* evaluates to `~$*, which is not a valid value for the second parameter of $gettok. mIRC therefore parses it as "0", which simply returns the total number of tokens. In other words, if $* touches ANY text, it no longer acts as it usually does, but is transformed to the value `~$*, which has no meaning. Even if the $* hadn't touched anything, it would still have echoed the same thing, because the $gettok is evaluated immediately, before $* calls the echo command $0 times.

There is a "hack" that you _could_ use to pass $* to identifiers, although what you're doing in reality is still passing it to a command, like this:

//tokenize 32 1 2 3 | scon -r echo -a $!gettok(one two three four, $* ,32)

What you see here is that I prevent the $gettok from evaluating now (or it would be 4 again), but I need a way to let it evaluate only when the $* is passing its parameters to the command (in our case echo -a). For that we use scon, which evaluates its parameters twice: once when passing them, and once when scon has reached the target connection. So instead of echo being called 3 times, it's scon that's being called 3 times now, each time with a command to echo something as its parameters. When scon reaches its target connection, it performs the passed command, in our case:

First time it calls scon: echo -a $gettok(one two three four, 1 ,32) --> echoes one
Second time it calls scon: echo -a $gettok(one two three four, 2 ,32) --> echoes two
...


Be warned!! Passing code to scon is dangerous and should generally be avoided. Nowadays I only use the scon -r trick if I'm positive that the parameters passed to it will never cause a problem with the double evaluation that comes with using scon/scid/timer etc. This double evaluation is the reason I escaped the $gettok by putting a ! right after the $, so that it is not evaluated now (when setting the command), but when scon reaches its destination.

Btw, $* is not documented for a good reason, it's a very special kind of identifier, and it's quirky. Check out this example:

//tokenize 32 1 2 3 | echo -a $* | echo -a $*

It only echoed 1, 2, 3 once (on separate lines), whereas I told it to do it twice. To illustrate better what happens:

//tokenize 32 1 2 3 | echo -a $* | tokenize 32 a b c 1 2 3 | echo -a $*

It echoed 1, 2, 3 twice now, ignoring the "a b c" part that I put in the second tokenize command. This is because $*'s internal counter isn't reset after it has been used. This probably doesn't make sense yet, but play with it a little and you will see exactly what the bug is. Basically, when $* has been called for, let's say, 4 tokens, the next time you tokenize and issue $* it starts counting from token 5 instead of going back to token 1. So after our first tokenize the counter was at 3, and after the second tokenize it starts at 4, which is the 1 again, as that is the fourth space-delimited token in the string "a b c 1 2 3". Then it moves to 5 and 6, echoing 2 and 3 respectively.
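
If the counter explanation is hard to picture, the following toy model in Python may help. It is only a sketch of the behaviour described in this post, not of how mIRC is actually implemented: the class StarModel and its methods are invented for illustration, and the one assumption baked in is that the counter survives a new tokenize.

```python
class StarModel:
    """Toy model of the $* quirk described above (not mIRC's real code)."""

    def __init__(self):
        self.tokens = []
        self.counter = 0  # assumption: tokenize never resets this

    def tokenize(self, text):
        # Stand-in for /tokenize 32: split on spaces, leave the counter alone.
        self.tokens = text.split()

    def echo_star(self):
        """Return the lines that `echo -a $*` would print in this model."""
        lines = self.tokens[self.counter:]   # resume where the last $* stopped
        self.counter += len(lines)
        return lines

m = StarModel()
m.tokenize("1 2 3")
print(m.echo_star())   # first $*: all three tokens
print(m.echo_star())   # second $*: counter is already past the end
m.tokenize("a b c 1 2 3")
print(m.echo_star())   # resumes at token 4, skipping "a b c"
```

The model reproduces both observations: a second $* on the same tokens prints nothing, and after re-tokenizing with six tokens only the last three come out.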

EDIT: It's 6 am, I'm pretty tired, and I sincerely hope I didn't accidentally edit out some parts again like I did a few posts ago.
Posted By: Rand Re: Regex Help - 21/10/05 03:49 AM
ah, nice.

Interesting technique.
© mIRC Discussion Forums