|
Joined: Jul 2005
Posts: 56
Babel fish
|
OP
Hi all, I'm new to regex and I'm stuck on a problem. I have this HTML: (August 15<SUP>th</SUP> 2005), and I want it to come out like this: (August 15th 2005). Can anyone help?
|
|
|
|
Joined: Feb 2004
Posts: 2,019
Hoopy frood
|
//var %a = (August 15<SUP>th</SUP> 2005) | echo -a $(,,$regsub(%a,/<.+?>/g,,%a)) %a
Gone.
|
|
|
|
Joined: Apr 2004
Posts: 759
Hoopy frood
|
You could just use $replace or $remove for something as simple as that:
//var %text = (August 15<SUP>th</SUP> 2005) | echo -a $replace(%text,<SUP>,,</SUP>,)
or
//var %text = (August 15<SUP>th</SUP> 2005) | echo -a $remove(%text,<SUP>,</SUP>)
or you can use this:
alias htmlstrip { var %x | .echo -q $regsub($1-,/<[^<>]+>/g,,%x) | return %x }
This removes HTML tags from a string, e.g. $htmlstrip(%text). EDIT: Beaten to it by FO as always *sighs* :P
$maybe
|
|
|
|
Joined: Jul 2005
Posts: 56
Babel fish
|
OP
Sorry, but I already know how to do it with $regsub:
alias htmlstrip { .echo -q $regsub($1-,/<[^<>]+>/g,,%x) | return %x }
What I want to know is whether there is a way to do it using $regex.
|
|
|
|
Joined: Apr 2003
Posts: 701
Hoopy frood
|
Here's a way without $regsub, but it suffers from the standard problems with spaces, so you should really put a $replace($1-,$chr(32),ø) at the first /var and the reverse on the return...
alias reremove {
var %t = $1-
while ($regex(%t,(^[^<]*)<[^>]+>(.*+)$)) var %t = $regml(1) $+ $regml(2)
return %t
}
echo -s $reremove(August 15<SUP>th</SUP> 2005)
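The space workaround mentioned above could be sketched roughly as follows. This is only an illustration under assumptions: the alias name reremove2 is made up here, $chr(248) stands in for the ø placeholder, and the input is assumed never to contain that character itself.

```mirc
; Hypothetical variant of reremove that protects spaces during the loop.
alias reremove2 {
  ; swap every space for the placeholder (assumed absent from the input)
  var %t = $replace($1-,$chr(32),$chr(248))
  ; strip one tag per pass until no tag is left
  while ($regex(%t,(^[^<]*)<[^>]+>(.*+)$)) var %t = $regml(1) $+ $regml(2)
  ; restore the spaces on the way out
  return $replace(%t,$chr(248),$chr(32))
}
```

It can be tried with //echo -s $reremove2(one <B>two</B> three), where a space sits directly next to a tag.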
|
|
|
|
Joined: Jul 2005
Posts: 56
Babel fish
|
OP
Regex is in your blood, Kelder. Thanks for helping me out. I've made a small socket script that shows the latest mirc.com news using a regular expression:
alias news { sockclose mirc | sockopen mirc mirc.com 80 }
on *:sockopen:mirc:{
  sockwrite -n $sockname GET / HTTP/1.0
  sockwrite -n $sockname Host: mirc.com $+ $str($crlf,2)
  echo -a * mIRC News:
}
on *:sockread:mirc:{
  sockread %temp
  if ($regex(mirc,%temp,/<B>(.+)</B>\C+">(\Q(\E.+\Q)\E)/)) {
    var %a = $regml(mirc,1) $regml(mirc,2), %b = $regsub(%a,/<[^<>]+>/g,$null,%a)
    echo -a %a
  }
}
|
|
|
|
Joined: Feb 2004
Posts: 2,019
Hoopy frood
|
You should always add error checking when working with sockets, especially in the sockopen event. The sockopen event will trigger, even if the connection failed, so you should check for $sockerr there.
That regex is excessive and unnecessary. The following will do fine if you change your GET to:
GET /get.html HTTP/1.0
if ($regex(mirc,%temp,/Download mIRC (\S+) or/)) {
  echo -a Version: $regml(mirc,1)
  sockclose $sockname
  return
}
As you can see, I do a sockclose once a match has been found, as it's no longer necessary to retrieve the rest of the data.
Btw, why not use a local variable %temp instead of a global one? Right now your script leaves a trailing %temp global var.
Furthermore, it would be a very good idea to create a while loop that checks $sockbr and keeps reading from the buffer within the same event, rather than letting the on sockread event trigger again for each line. Letting the event trigger is slower.
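Put together, those suggestions might look something like the sketch below. It is a rough illustration rather than a finished script: the failure message is made up, and it assumes the /get.html page still contains the "Download mIRC ... or" text matched above.

```mirc
on *:sockopen:mirc:{
  ; sockopen triggers even when the connection failed, so check $sockerr first
  if ($sockerr) { echo -a * Connection to mirc.com failed | return }
  sockwrite -n $sockname GET /get.html HTTP/1.0
  sockwrite -n $sockname Host: mirc.com $+ $str($crlf,2)
}
on *:sockread:mirc:{
  if ($sockerr) return
  ; local variable, so nothing is left behind in the global variables list
  var %temp
  sockread %temp
  ; keep reading lines from the buffer inside this one event
  while ($sockbr) {
    if ($regex(mirc,%temp,/Download mIRC (\S+) or/)) {
      echo -a Version: $regml(mirc,1)
      ; nothing else is needed once the match is found
      sockclose $sockname
      return
    }
    sockread %temp
  }
}
```

The loop ends when sockread reads nothing ($sockbr is 0), and the event fires again once more data arrives.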
Gone.
|
|
|
|
Joined: Jul 2005
Posts: 56
Babel fish
|
OP
Thanks for your advice, FOP, it really helped me a lot. Oh, by the way, the socket snippet you just saw isn't finished yet. It searches for the latest mirc.com news, meaning anything that ends with a date in brackets, like (August 15th 2005).
Last edited by whoami; 24/11/05 06:27 PM.