
#170686 12/02/07 09:51 PM
pouncer
This is the HTML in the page I want to grab info from:

td class=StyleTahoma><a href='chatroom.php?rm=helpdesk'>helpdesk</a><br><a href='chatroom.php?rm=irchat'>ircchat</a><br><a href='chatroom.php?rm=lebs'>lebs</a><br><a href='chatroom.php?rm=domcat'>domcat</a><br></td></tr></table>

It tells me I'm in 4 rooms: helpdesk, irchat, lebs, domcat.

Could someone show me how I can extract this info from the string?
Thanks.

#170689 12/02/07 10:01 PM
Joined: Apr 2006
Posts: 399
Fjord artisan
See /help $remove. I would do it for you, but I don't know whether the HTML always looks exactly like the code above, character for character. I'd bet regex could do it, but I don't know regex or how to use it.

Last edited by Kurdish_Assass1n; 12/02/07 10:01 PM.
pouncer
Yeah, I thought of regex too, but I'm not sure how to do it myself.

I hope there is someone else out there who can help.

#170693 12/02/07 11:19 PM
Joined: Oct 2004
Posts: 8,061
Hoopy frood
You can use the $htmlfree alias. There are a few different versions out there, but this is the first one I found by searching the forum:

Code:
alias -l htmlfree {
  var %x, %i = $regsub($1-,/(^[^<]*>|<[^>]*>|<[^>]*$)/g,$chr(32),%x), %x = $remove(%x,&nbsp;)
  return %x
}


Then use $htmlfree(%variable) and it will remove all the HTML tags. Note that, because of the -l switch, the $htmlfree alias only works from within the script file it is defined in. If you want it to work from the command line or in other scripts, remove the -l from the first line.

That's a slightly modified version that inserts a space in place of each tag, so the words stay separated. Normally this sort of alias uses $null instead of $chr(32).
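
As a rough sketch of how that could be applied to the room list, something like the following might work. The alias name showrooms and the shortened sample string are made up for illustration, and the alias has to live in the same script file as htmlfree while that one is still declared with -l:

```mirc
alias showrooms {
  ; hypothetical test string: two of the links from the first post
  var %html = <a href='chatroom.php?rm=helpdesk'>helpdesk</a><br><a href='chatroom.php?rm=lebs'>lebs</a><br>
  ; strip the tags, leaving the room names separated by spaces
  var %rooms = $htmlfree(%html)
  ; token identifiers ignore empty tokens, so runs of spaces count as one separator
  echo -a Rooms: $numtok(%rooms,32) - %rooms
  ; individual rooms can be picked out with $gettok
  echo -a First room: $gettok(%rooms,1,32)
}
```

Because the tags are replaced by spaces, this assumes room names never contain spaces themselves.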

#170769 14/02/07 05:37 AM
astigmatik
DISCLAIMER: I have no mirc here, and I'm not a regex expert, but this might just work:

Code:
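alias extract_channels {
  ; The only reason I'm separating the expression is to make it clearer.
  ; It captures the text of each <a ...>text</a> link; the g flag collects every match.
  var %r = /<a\s[^>]*>([^<]*)<\/a>/g
  noop $regex($1-,%r)
  ; %result is declared local so it doesn't leak into the global variables
  var %x = 1, %result
  while ($regml(%x) != $null) {
    %result = $addtok(%result,$v1,44)
    inc %x
  }
  return %result
}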
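
If the room id from the link target is wanted rather than the link text (in the sample the two differ for one room: rm=irchat versus ircchat), a variation could capture the rm= value instead. This is only a sketch: the alias name roomids is invented here, and it assumes the ids never contain quotes, ampersands or spaces.

```mirc
alias roomids {
  ; hypothetical helper: returns a comma-separated list of the rm= values found in $1-
  ; with the g flag, $regex returns the number of matches
  var %n = $regex(roomids,$1-,/rm=([^'"&>\s]+)/g)
  var %i = 1, %ids
  while (%i <= %n) {
    ; $regml(name,N) is the Nth captured value for the named regex
    %ids = $addtok(%ids,$regml(roomids,%i),44)
    inc %i
  }
  return %ids
}
```

It would be called as $roomids(%variable), where %variable holds the line read from the page.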

#170774 14/02/07 10:17 AM
Joined: Oct 2006
Posts: 166
Vogon poet
You can use the htmlstrip thingy and a parser. Something like this:
Code:
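alias htmlchans {
  var %a = $1
  ; replace every tag (and any partial tag at either end) with a period
  noop $regsub(%a,/(?:^[^<]*?>|<[^>]*?>|<[^>]*$)/g,.,%a)
  ; drop leading, repeated and trailing periods so a single period separates the names
  noop $regsub(%a,/(?:^\.+|(?<=\.)\.+|\.+$)/g,$null,%a)
  ; with N, return the Nth name; without it, return the count followed by the list
  return $iif($2,$gettok(%a,$2,46),$numtok(%a,46) - %a)
}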

$htmlchans(string[,N])
N refers to the Nth channel (it's optional).
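
A possible way to call it, as a sketch only: the alias name chantest and the shortened sample string are made up, and the string must not contain commas, since it is passed as a single identifier parameter.

```mirc
alias chantest {
  ; hypothetical test string: two of the links from the first post
  var %html = <a href='chatroom.php?rm=helpdesk'>helpdesk</a><br><a href='chatroom.php?rm=lebs'>lebs</a><br>
  ; no N: the count followed by the period-separated list
  echo -a $htmlchans(%html)
  ; N = 2: just the second name
  echo -a $htmlchans(%html,2)
}
```

Since a period is used as the separator, this assumes room names never contain periods.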

