#130861 23/09/05 01:39 AM
Joined: Aug 2005
Posts: 21
Ameglian cow
OP Offline
Hey, I'm kind of new to mIRC scripting and I need help with this:

Code:
($regex(%data,/^(.+)(?:\(.+\))?\((\d+)\): (.+)$/U))  


Can somebody explain to me in simple terms what this regex means? It looks like gibberish to me. I've also read plenty of tutorials about it and still don't understand it. Thanks in advance.

#130862 23/09/05 06:42 AM
Joined: Sep 2005
Posts: 2,881
Hoopy frood
Offline
Since there are a few complex concepts in that expression, my best advice is to read a decent tutorial. http://www.regular-expressions.info has a good one.

#130863 23/09/05 08:47 AM
Joined: Jul 2003
Posts: 655
Fjord artisan
Offline
The ^ and $ at the start and end of the expression are anchors. They match the start and end of the string the expression is applied to, respectively.

(.+) - The period/dot represents any single character, and the plus sign is a quantifier that matches one or more repetitions of the preceding item. Here the preceding item is the period, so it will match one or more of any combination of characters.

Basic outcome: a string of one or more characters in length, e.g. abcd or ac17e.

(?:\(.+\))? - This contains a few different things. First, the ?: (question mark followed by a colon) tells the regex engine not to capture that group, so no backreference is created for it (backreferences can be used to retrieve or substitute the specific string that portion of the regular expression matched). The question mark outside the parentheses tells the engine that this portion of the expression is optional, so it doesn't need to be matched, but will be if it exists. The \ (backslash) in a regular expression is an escape character. Since the ( ) parentheses are special characters in regex syntax, escaping them means you literally want to match a ( and ) character.

Basic outcome: a string of one or more characters surrounded by parentheses, e.g. (abcd) or (ac17e).
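
To make the optional, non-capturing part a bit more concrete, here is a minimal sketch in Python, whose re module accepts the same syntax for this fragment as mIRC's PCRE engine. The surrounding x and y literals and the variable names are made up purely for illustration.

```python
import re

# Hypothetical demo pattern: a literal x, an optional parenthesised chunk, a literal y.
optional_group = re.compile(r'^x(?:\(.+\))?y$')

with_parens = optional_group.match('x(abc)y')     # optional part present
without_parens = optional_group.match('xy')       # optional part absent

print(bool(with_parens), bool(without_parens))

# (?:...) groups without capturing, so this pattern exposes no capture groups.
print(optional_group.groups)
```

Both inputs match, and because the group is non-capturing there is nothing for a backreference to retrieve.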

\((\d+)\): - Again we see the use of escapes, and we are now introduced to \d, which represents a single digit. Because it is followed immediately by a plus sign, as described above, it matches one or more repetitions of \d. The : (colon) after the final parenthesis is not a special character.

Basic outcome: a number of any length, surrounded by parentheses, followed by a colon, e.g. (1234): or (19733534):
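
A small sketch of this fragment on its own, again using Python's re module as a stand-in for mIRC's regex engine. The sample strings and variable names are invented for the example.

```python
import re

# Escaped parentheses match literally; the inner (\d+) captures the digits.
digits_in_parens = re.compile(r'\((\d+)\):')

hit = digits_in_parens.search('(1234): hello')    # digits only: matches
miss = digits_in_parens.search('(12a4): hello')   # a letter inside: no match

print(hit.group(1))
print(miss)
```

The first search captures the digits without the surrounding parentheses, and the second returns no match because \d+ cannot get past the letter.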

The parentheses around non-optional portions, as seen in this expression, create backreferences (captured groups) that can be used at a later stage. They also allow you to apply regex operators to the entire grouped regex.

The final (.+) is the same as the one at the start, except that it matches at the very end of the string.

The / / around the entire expression are delimiters used in mIRC to mark the start and end of the regular expression, so that it does not misinterpret special characters as mIRC code. The U after the closing / is a modifier/switch that makes the quantifiers ungreedy, so they match as little as possible.
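
To illustrate what ungreedy means, here is a rough sketch in Python. Python's re module has no U modifier, so as an assumption the ungreedy behaviour is imitated by writing the lazy quantifier +? by hand. The sample text is made up.

```python
import re

text = '(bc)(123)'

# Greedy: .+ grabs as much as it can and still lets the final \) match.
greedy = re.search(r'\((.+)\)', text).group(1)

# Lazy (what /U would do to .+): stop at the first closing parenthesis.
ungreedy = re.search(r'\((.+?)\)', text).group(1)

print(greedy)
print(ungreedy)
```

The greedy version swallows everything up to the last closing parenthesis, while the lazy version stops at the first one.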

Note: as you can see, there is a plain space in this regular expression. A literal space is valid and matches exactly one space character. \s is a broader alternative that matches any whitespace character (space, tab, newline), so the two are not interchangeable.
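
For anyone unsure how a literal space and \s behave, a quick check along these lines can help. This is a sketch in Python's re module, assumed here to behave like PCRE for this case, with invented test strings.

```python
import re

literal_space = re.compile(r'^a b$')     # exactly one space character
any_whitespace = re.compile(r'^a\sb$')   # any single whitespace character

space_vs_space = literal_space.match('a b') is not None
space_vs_tab = literal_space.match('a\tb') is not None
class_vs_space = any_whitespace.match('a b') is not None
class_vs_tab = any_whitespace.match('a\tb') is not None

print(space_vs_space, space_vs_tab, class_vs_space, class_vs_tab)
```

The literal space only accepts a real space, while \s also accepts a tab.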

This regular expression basically matches something like ?*(?*)(#*): ?* (where # is a digit and the (?*) part is optional), e.g. 'a(bc)(123): def' or 'a(8897): bfg'.

I think I read that right; no checking or testing was done to be sure.

Hope this helped a little, but as suggested, regular-expressions.info has some great beginner resources and tutorials to help you understand regex better.
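
As a sanity check of that summary, here is a small sketch that runs the two example strings, plus two invented counter-examples, through the pattern using Python's re module. The U modifier is left out on purpose, since greediness changes what each group captures but not whether the string matches at all.

```python
import re

# The pattern from the original post, without the mIRC /.../U wrapper.
PATTERN = re.compile(r'^(.+)(?:\(.+\))?\((\d+)\): (.+)$')

samples = [
    'a(bc)(123): def',   # optional (..) part present
    'a(8897): bfg',      # optional part absent
    'a(x): def',         # no digits inside the parentheses
    'a(123):def',        # missing the space after the colon
]

results = {s: PATTERN.match(s) is not None for s in samples}
for sample, matched in results.items():
    print(sample, '->', matched)
```

The first two strings match and the last two do not.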


"Allen is having a small problem and needs help adjusting his attitude" - Flutterby
#130864 23/09/05 09:34 PM
Joined: Apr 2003
Posts: 701
Hoopy frood
Offline
http://pcre.org/pcre.txt is the absolute reference guide to regex for mIRC. PCRE is the package that's built into mIRC. Just skip the first 1470 or so lines and start at PCRE REGULAR EXPRESSION DETAILS.

As for that specific regex:

/^ <= matches only at the beginning of the string
(.+) <= anything or even nothing, and remember it in $regml(1)
(?:\(.+\))? <= anything between brackets ( ) but maybe just nothing, no brackets either
\((\d+)\) <= one or more digits 0-9 between brackets, remember the digits in $regml(2)
: <= a colon followed by a space
(.+) <= anything again, or nothing, and remember that in $regml(3)
$ <= the end of the string, so that the previous contains anything right until the end of the input
/U <= Ungreedy, if you have the choice between something or something more, choose something :)
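
To see which pieces would end up in $regml(1) to $regml(3), here is a rough Python sketch. Python has no U modifier, so as an assumption every quantifier is written in its lazy form to imitate it. The regml helper and the sample line are hypothetical and only mimic the idea of mIRC's $regml.

```python
import re

# Lazy translation of the /U pattern: + becomes +?, ? becomes ??.
UNGREEDY = re.compile(r'^(.+?)(?:\(.+?\))??\((\d+?)\): (.+?)$')

def regml(data):
    """Return the captured groups as a list, or an empty list on no match."""
    match = UNGREEDY.match(data)
    return list(match.groups()) if match else []

captures = regml('nick(away)(42): hello there')
for index, value in enumerate(captures, start=1):
    print(f'regml({index}) = {value}')
```

The non-capturing (away) part is matched but not remembered, so only three values come back.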

#130865 24/09/05 02:11 AM
Joined: Jul 2003
Posts: 655
Fjord artisan
Offline
A period represents any single character, and the plus represents one or more repetitions. Therefore (.+) must match at least one character (even if it's a space, etc.) before the regex engine will move on. (.*) would be anything or nothing.
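
A short sketch of that difference, using Python's re module as a stand-in for mIRC's engine. The simplified patterns and inputs are invented for the example.

```python
import re

plus = re.compile(r'^(.+)\((\d+)\)$')   # needs at least one character first
star = re.compile(r'^(.*)\((\d+)\)$')   # allows nothing before the parenthesis

plus_on_bare = plus.match('(42)')
star_on_bare = star.match('(42)')
plus_on_prefixed = plus.match('ab(42)')

print(plus_on_bare)
print(repr(star_on_bare.group(1)))
print(plus_on_prefixed.group(1))
```

With nothing in front of the parenthesis, only the (.*) version matches, and its first group is the empty string.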

