#32902 29/06/03 09:35 AM
Joined: May 2003
Posts: 730
S
ScatMan Offline OP
Hoopy frood
//var %y,%x = $regsub(a-b-c-d*a,/([^-a-z]|\*[a-z]+?)/g,,%y) | echo -a %y
I'm trying to remove everything in the text except "-" and the letters a-z.
But if there is a * in front of the letters, it should also remove all the letters after it until the last (I used +? to match as little as possible), so my code should return a-b-c-d


#32903 29/06/03 12:34 PM
Joined: Jan 2003
Posts: 2,523
Q
Hoopy frood
This is happening because when $regsub reaches the * char, the subpattern that matches it is [^-a-z], ie the first subpattern in the (). So it removes the star and moves on. It won't remove the last "a" because the * char was dealt with in the previous run. Now the only thing $regsub sees is an "a", which of course it does not remove. This is the nature of consuming subpatterns, like (pattern) and (?:pattern). Non-consuming subpatterns are the assertions: negative/positive lookahead/lookbehind assertions, \b etc.

There are two ways to do what you want: the simplest one is to just change the order of the subpatterns:
//var %y,%x = $regsub(a-b-c-d*a,/(\*[a-z]+?|[^-a-z])/g,,%y) | echo -a %y

Another way would be to turn the \* in the second subpattern into a positive lookbehind assertion, so the letters only match when a * comes right before them:
//var %y,%x = $regsub(a-b-c-d*a,/([^-a-z]|(?<=\*)[a-z]+)/g,,%y) | echo -a %y
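
For anyone who wants to check the three patterns outside mIRC, here is a minimal sketch in Python. It assumes Python's re module behaves like mIRC's PCRE for these particular patterns (ordered alternation, lazy quantifiers, fixed-width lookbehind); the variable names are made up for illustration.

```python
import re

text = "a-b-c-d*a"

# Original order: [^-a-z] is tried first at "*", so only the star is removed.
original = re.sub(r"([^-a-z]|\*[a-z]+?)", "", text)

# Fix 1: try the "star followed by letters" alternative first.
reordered = re.sub(r"(\*[a-z]+?|[^-a-z])", "", text)

# Fix 2: keep the order, but let letters match only right after a "*".
lookbehind = re.sub(r"([^-a-z]|(?<=\*)[a-z]+)", "", text)

print(original)    # a-b-c-da
print(reordered)   # a-b-c-d
print(lookbehind)  # a-b-c-d
```

Note that the two fixes are not identical in general: with more than one letter after the star, eg "d*abc", the lazy +? in the first fix removes only "*a" and leaves "dbc", while the greedy + in the lookbehind version removes all three letters and leaves "d".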


/.timerQ 1 0 echo /.timerQ 1 0 $timer(Q).com
#32904 29/06/03 12:50 PM
Joined: May 2003
Posts: 730
S
ScatMan Offline OP
Hoopy frood
I don't understand it.
Why does it work when you change the order? What's the problem with mine?
I read what you said and still couldn't understand the problem with the \*

#32905 29/06/03 04:50 PM
Joined: Feb 2003
Posts: 2,812
Hoopy frood
Whenever you want to include the '-' character in a character class, it MUST be the last character. I'm not saying this is the only problem in your expression, it's just the first thing that caught my attention.

[^-a-z] bad
[^a-z-] good

- Raccoon


Well. At least I won lunch.
Good philosophy, see good in bad, I like!
#32906 29/06/03 05:00 PM
Joined: Jan 2003
Posts: 2,523
Q
Hoopy frood
man.txt:
The minus (hyphen) character can be used to specify a range of characters in a character class. For example, [d-m] matches any letter between d and m, inclusive. If a minus character is required in a class, it must be escaped with a backslash or appear in a position where it cannot be interpreted as indicating a range, typically as the first or last character in the class.
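
The rule quoted above can be checked with a small sketch. This uses Python's re module rather than mIRC, on the assumption that it treats hyphens in character classes the same way as PCRE here; the sample strings and variable names are only illustrative.

```python
import re

text = "a-b*c"

# Hyphen first (right after ^), last, or escaped: all three are a literal "-".
first = re.sub(r"[^-a-z]", "", text)
last = re.sub(r"[^a-z-]", "", text)
escaped = re.sub(r"[^a-z\-]", "", text)

print(first, last, escaped)  # a-bc a-bc a-bc

# Between two characters the hyphen means a range instead.
as_range = re.findall(r"[a-c]", "abc-")    # a, b and c, but not "-"
as_literal = re.findall(r"[ac-]", "abc-")  # a, c and "-", but not b

print(as_range)    # ['a', 'b', 'c']
print(as_literal)  # ['a', 'c', '-']
```

So [^-a-z] and [^a-z-] behave the same, and the hyphen position is not what breaks the original expression.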


#32907 29/06/03 06:38 PM
Joined: May 2003
Posts: 730
S
ScatMan Offline OP
Hoopy frood
Please tell me what the problem is, I can't figure it out.

#32908 30/06/03 08:36 AM
Joined: Jan 2003
Posts: 2,523
Q
Hoopy frood
I don't think I can do any better than my first post...

Maybe it hasn't been clear to you that PCRE parses both the input string and the regex pattern from left to right. As a result of that behaviour, every character up to and including the "*" (ie the "a-b-c-d*" part of the string a-b-c-d*a) is tried against the first subpattern [^-a-z] before the second one, and because of the /g the search carries on after each match. When the "*" is reached, [^-a-z] matches it. So the second subpattern \*[a-z]+? never gets the chance to match "*a", because "*" was already consumed by [^-a-z].
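
To see which alternative actually wins at the "*", here is a rough sketch in Python (again assuming its re module orders alternatives left to right the same way PCRE does). Each alternative gets its own capture group, which is a change from the original pattern made purely so the match can report which branch was used.

```python
import re

text = "a-b-c-d*a"

# Each alternative in its own group; lastindex tells us which one matched.
original_order = re.compile(r"([^-a-z])|(\*[a-z]+?)")
swapped_order = re.compile(r"(\*[a-z]+?)|([^-a-z])")

def trace(pattern, subject):
    """Return (position, matched text, number of the alternative that matched)."""
    return [(m.start(), m.group(0), m.lastindex) for m in pattern.finditer(subject)]

trace_original = trace(original_order, text)
trace_swapped = trace(swapped_order, text)

print(trace_original)  # [(7, '*', 1)]
print(trace_swapped)   # [(7, '*a', 1)]
```

In both cases the only match starts at position 7 and is made by the first alternative. With the original order that alternative is [^-a-z], which takes just the "*"; with the swapped order it is \*[a-z]+?, which takes "*a".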


#32909 30/06/03 09:43 AM
Joined: May 2003
Posts: 730
S
ScatMan Offline OP
Hoopy frood
Thanks a lot man, I get it!

