#21953 02/05/03 12:16 PM
Joined: Dec 2002
Posts: 124
B
bloupx Offline OP
Vogon poet
Okay, here's what I want to do: connect to a website, retrieve info, and remove all the HTML crap. Example:
var %r = <li><a href="http://www.cis.ohio-state.edu/cgi-bin/rfc/rfc1459.html"><b>RFC1459 </b></a></li>
var %d,%s = $regsub(%r,/(<[^>]+>)/g,$chr(32),%d)

In the example above I need the URL, but the $regsub removes it. How would I fix that?

Everywhere else, my $regsub works fine.

#21954 02/05/03 12:49 PM
Joined: Jan 2003
Posts: 2,523
Q
Hoopy frood
Offline
Try this:
Code:
var %d, %s = $regsub(%r,/^.*?<a href="(.+?)".*/,\1,%d)


or this:
Code:
var %d = $regex(%r,/<a href="(.+?)"/), %d = $regml(1)

The captured subpatterns are represented by \1, \2, \3 etc.
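
For readers more used to other regex flavours, the two approaches above can be sketched in Python's `re` module. This is only a stand-in for illustration, not mIRC code, and the variable names are made up here. The first call replaces the whole line with capture group 1, like the $regsub version; the second searches and reads group 1 directly, like $regex followed by $regml(1).

```python
import re

# Sample line from the first post.
line = ('<li><a href="http://www.cis.ohio-state.edu/cgi-bin/rfc/rfc1459.html">'
        '<b>RFC1459 </b></a></li>')

# Approach 1: the pattern matches the entire line, and the replacement
# keeps only what capture group 1 (the URL) matched.
url_via_sub = re.sub(r'^.*?<a href="(.+?)".*', r'\1', line)

# Approach 2: search for the anchor tag and read capture group 1 directly.
match = re.search(r'<a href="(.+?)"', line)
url_via_search = match.group(1) if match else None

print(url_via_sub)
print(url_via_search)
```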


/.timerQ 1 0 echo /.timerQ 1 0 $timer(Q).com
#21955 02/05/03 01:49 PM
Joined: Dec 2002
Posts: 124
B
bloupx Offline OP
Vogon poet
Yeah, that works for the example above, but it should work with the other bits too, where it should just remove all the HTML code. :(

thanks for your effort.
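
One possible way to get both behaviours is to do it in two passes: first turn each opening anchor tag into its URL, then strip whatever tags remain. The thread does not give this solution. It is a minimal sketch in Python rather than mIRC script, and the function name `strip_html_keep_links` is hypothetical. It assumes the href value is double-quoted and is the first attribute of the tag, as in the example line.

```python
import re

def strip_html_keep_links(line):
    """Remove HTML tags from a line but keep the URLs of <a href="..."> tags."""
    # Pass 1: replace each opening anchor tag with its URL, padded by spaces.
    # Assumes href is double-quoted and is the first attribute.
    with_urls = re.sub(r'<a\s+href="([^"]*)"[^>]*>', r' \1 ', line)
    # Pass 2: replace every remaining tag with a space (same idea as the
    # original $regsub pattern).
    no_tags = re.sub(r'<[^>]+>', ' ', with_urls)
    # Collapse the runs of spaces left behind by the removed tags.
    return ' '.join(no_tags.split())

sample = ('<li><a href="http://www.cis.ohio-state.edu/cgi-bin/rfc/rfc1459.html">'
          '<b>RFC1459 </b></a></li>')
print(strip_html_keep_links(sample))
```

Lines without any anchor tag just go through the second pass, so they are stripped the same way as before.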

#21956 02/05/03 02:39 PM
Joined: Dec 2002
Posts: 1,922
O
Hoopy frood
Offline
BTW, what's the role of the ?'s in the following pattern?

$regsub(%r,/^.*?<a href="(.+?)".*/,\1,%d)

If it means "allow .* (or .+) zero or one time", it might be unnecessary for this pattern.

Last edited by Online; 02/05/03 02:48 PM.
#21957 02/05/03 03:45 PM
Joined: Dec 2002
Posts: 699
N
Fjord artisan
Offline
It makes the quantifier "ungreedy": it takes the minimum number of characters needed for the rest of the pattern to match, instead of the maximum.
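
The difference only shows up when there is more than one place the match could stop. A small sketch in Python's `re` module, which treats `?` after a quantifier the same way for this purpose. The input line with an extra `title` attribute is invented for illustration and is not from the thread.

```python
import re

# Hypothetical line with a second quoted attribute after href.
line = '<a href="http://example.com/" title="Example">link</a>'

# Greedy: (.+) grabs as much as possible, so it runs to the LAST quote.
greedy = re.search(r'<a href="(.+)"', line).group(1)

# Ungreedy: (.+?) grabs as little as possible, so it stops at the FIRST quote.
ungreedy = re.search(r'<a href="(.+?)"', line).group(1)

print(greedy)    # http://example.com/" title="Example
print(ungreedy)  # http://example.com/
```

On the original example line there are only two quote characters, so both forms happen to capture the same URL there. The ungreedy form matters as soon as another quoted attribute follows.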

#21958 03/05/03 10:50 AM
Joined: Jan 2003
Posts: 2,523
Q
Hoopy frood
Offline
As Nimue said, the "?" makes the quantifier ungreedy. If you want more info, PCRE's man.txt explains what greedy/ungreedy is around line 1500.


/.timerQ 1 0 echo /.timerQ 1 0 $timer(Q).com
#21959 03/05/03 05:17 PM
Joined: Dec 2002
Posts: 1,922
O
Hoopy frood
Offline
Oh, that's very clever indeed :) Thank you for the clarification.

