Posted By: bloupx $regsub - 02/05/03 12:16 PM
Okay, here's what I want to do: connect to a website, retrieve some info, and remove all the HTML crap. Example:
var %r = <li><a href="http://www.cis.ohio-state.edu/cgi-bin/rfc/rfc1459.html"><b>RFC1459 </b></a></li>
var %d,%s = $regsub(%r,/(<[^>]+>)/g,$chr(32),%d)

In the example above I need the URL, but the $regsub removes it. How would I fix that?

Everywhere else, my $regsub works fine.
Posted By: qwerty Re: $regsub - 02/05/03 12:49 PM
Try this:
Code:
var %d, %s = $regsub(%r,/^.*?<a href="(.+?)".*/,\1,%d)


or this:
Code:
var %d = $regex(%r,/<a href="(.+?)"/), %d = $regml(1)

The captured subpatterns are represented by \1, \2, \3 etc.
Posted By: bloupx Re: $regsub - 02/05/03 01:49 PM
Yeah, that works for the example above, but it should also work on the other parts, where it should just remove all the HTML code :(

Thanks for your effort.
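
One way to cover both cases is to grab the URL first and strip the tags afterwards. This is only a sketch built from the two snippets already posted in this thread; the alias name htmltext is made up for illustration, and it assumes at most one link per line.

```mirc
; Hypothetical alias: returns the first link URL (if any) followed by the tag-free text.
alias htmltext {
  var %url, %d, %s
  var %r = $1-
  ; Capture the href value before the tags are removed.
  if ($regex(%r,/<a href="(.+?)"/)) %url = $regml(1)
  ; Replace every tag with a space, as in the original $regsub.
  %s = $regsub(%r,/<[^>]+>/g,$chr(32),%d)
  return %url %d
}
```

Called as $htmltext(%r), a line without a link should come back as plain text only, and a line with a link should come back with the URL in front of the text.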
Posted By: Online Re: $regsub - 02/05/03 02:39 PM
BTW, what's the role of the ?'s in the following pattern?

$regsub(%r,/^.*?<a href="(.+?)".*/,\1,%d)

If it means "allow .* (or .+) zero or one time", it might be unnecessary for this pattern.
Posted By: Nimue Re: $regsub - 02/05/03 03:45 PM
It makes the quantifier "ungreedy": it matches as few characters as possible while still letting the rest of the pattern match.
Posted By: qwerty Re: $regsub - 03/05/03 10:50 AM
As Nimue said, the "?" makes the quantifier ungreedy. If you want more info, PCRE's man.txt explains what greedy/ungreedy is around line 1500.
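
A small sketch of the difference, assuming a hypothetical alias name greedydemo and a made-up tag that has a second quoted attribute:

```mirc
; Hypothetical demo alias: compares a greedy and an ungreedy capture on the same text.
alias greedydemo {
  var %t = <a href="one.html" class="x">
  ; Greedy: (.+) takes as much as it can, so it runs on to the last quote.
  var %n = $regex(%t,/href="(.+)"/)
  echo -a greedy: $regml(1)
  ; Ungreedy: (.+?) stops at the first quote that lets the pattern match.
  var %n = $regex(%t,/href="(.+?)"/)
  echo -a ungreedy: $regml(1)
}
```

The greedy capture should come out as one.html" class="x and the ungreedy one as one.html. In bloupx's example the line contains only one pair of quotes, so both forms happen to return the same URL there.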
Posted By: Online Re: $regsub - 03/05/03 05:17 PM
Oh, that's very clever indeed :) Thank you for the clarification.
© mIRC Discussion Forums