Posted By: Darwin_Koala A request for Reg_Ex help/explanation - 19/11/06 01:58 AM
I am trying to write an alias to extract a string of 8 hex characters. I have tried various combinations, none of which seem to work. Eventually I came up with this string:
Code:
   
if $regex(hex_code, $$1, /([0-9a-fA-F]{8})/)  { return $regml(hex_code,1) }
 


But the following examples did not give the results I expected:
Code:
 
alias testprce {
  ;var %localstring = $regex(hex_code, $$1, /(\d{3})/)
  ;var %localstring = $regex(hex_code, $$1, /([:xdigit:]{3})/)
  ;var %localstring = $regex($$1, /[:xdigit:]/)
  echo 14 %AllUsers_CustomWindow (Reg Ex test): $$1, %localstring $regml(hex_code,1)
}
 


test command used was: /testprce ~d829879a

I have had some interesting results; but not what I expected.
The first results in "1". I find this surprising, because I thought that it would result in "4" (829, 298, 987, 879).
The 2nd and 3rd examples both result in "0".

I have tried reading the PCRE guide (extracted manual entry), but I think I am confusing myself somewhere down the line.

This is my first real foray into regex. I have done a quick search of the forum here, but the examples I found were too complicated for me (at this stage).

Can someone please explain the (strange to me) results?

Thanks,

DK
Posted By: jaytea Re: A request for Reg_Ex help/explanation - 19/11/06 02:34 AM
Code:
$regex(hex_code, $$1, /(\d{3})/)


this will never return anything other than 0 or 1 since you haven't used the /g modifier, i.e. /(\d{3})/g (which makes it continue past the first successful match), but that still wouldn't work as you expect. the way a regular expression is matched against the string, it consumes the string as it moves through it. so let's suppose the regex matching point is at the start of "829879". after \d{3} is matched, the matching point is after "829", and during the next cycle it starts where it left off and attempts to match \d{3} here in the middle: 829|879. so even with /g you'd get 2 matches (829 and 879), not 4.
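
to see the consuming behaviour for yourself, here's a minimal sketch. the alias name testg is made up for illustration, and it assumes the same /g pattern as above:

Code:
alias testg {
  ; hypothetical test alias: count non-overlapping runs of 3 digits using /g
  var %n = $regex(hex_code, $$1, /(\d{3})/g)
  ; $regml(hex_code,N) returns the Nth captured string
  echo -a matches: %n first: $regml(hex_code,1) second: $regml(hex_code,2)
}

with /testg ~d829879a this should report 2 matches, 829 and 879, since the second attempt starts after the first match has been consumed.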

if you want it to match these overlapping occurrences of numbers, you'll need to progress through the series of matches one character at a time, but at the same time check that there exist 3 digits in a row without the matching point progressing past them. this can be done with a zero-width assertion:

Code:
$regex(hex_code, $$1, /(?=(\d{3}))/g)


(?=) takes a look ahead to see if \d{3} is matched, without actually consuming those characters. since (?=) is zero-width, that expression actually matches a position in the string, before any group of 3 digits. the inner parentheses still capture the 3 digits, so $regml can return them.
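
as a rough sketch of how you might list every overlapping match (the alias name testoverlap is hypothetical, everything else is the pattern above):

Code:
alias testoverlap {
  ; hypothetical test alias: list every overlapping run of 3 digits
  ; the lookahead matches a position, the inner group captures the digits
  var %n = $regex(hex_code, $$1, /(?=(\d{3}))/g)
  var %i = 1
  while (%i <= %n) {
    echo -a match %i is $regml(hex_code,%i)
    inc %i
  }
}

with /testoverlap ~d829879a this should list the four matches you were expecting: 829, 298, 987 and 879.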

the problem with the other two is quite simply that [:xdigit:] itself belongs inside a character class, i.e. [[:xdigit:]]. this allows you to do such things as [A-Z[:xdigit:]a-z] etc.
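
applied to your original goal of pulling out 8 hex characters, a sketch might look like this. the alias name gethex is made up, and it is just your first line with the POSIX class swapped in:

Code:
alias gethex {
  ; hypothetical alias: return the first run of 8 hex digits found in $1
  ; [[:xdigit:]] is equivalent to [0-9a-fA-F]
  if ($regex(hex_code, $$1, /([[:xdigit:]]{8})/)) { return $regml(hex_code,1) }
}

for example, //echo -a $gethex(~d829879a) should echo d829879a.
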
Thanks for that, I had forgotten [that I had read] that regex consumes the string. I had thought that "global" matching was the default. I understand that it is "greedy" by default as well, but my pattern didn't allow for greediness.

I'll just have to practice more!

Cheers,

DK.
© mIRC Discussion Forums