Posted By: pouncer parsing - 12/02/07 09:51 PM
This is the HTML in the page I want to grab info from:

<td class=StyleTahoma><a href='chatroom.php?rm=helpdesk'>helpdesk</a><br><a href='chatroom.php?rm=irchat'>ircchat</a><br><a href='chatroom.php?rm=lebs'>lebs</a><br><a href='chatroom.php?rm=domcat'>domcat</a><br></td></tr></table>

It tells me I'm in 4 rooms: helpdesk, irchat, lebs, domcat.

Could someone show me how I can extract this info from the string?
Thanks.
Posted By: Kurdish_Assass1n Re: parsing - 12/02/07 10:01 PM
/help $remove

I would do it for you, but I don't know if the page is always laid out like the code above, meaning the exact characters. I'd bet regex would be able to do it, but I don't know what it is or how to use it.
Posted By: pouncer Re: parsing - 12/02/07 11:02 PM
Yeah, I thought regex too, but I'm not sure myself.

Hope there is someone else out there who can help.
Posted By: Riamus2 Re: parsing - 12/02/07 11:19 PM
You can use an $htmlfree alias. There are a few different ones out there, but this is the first one I found by searching the forum:

Code:
alias -l htmlfree {
  var %x, %i = $regsub($1-,/(^[^<]*>|<[^>]*>|<[^>]*$)/g,$chr(32),%x), %x = $remove(%x,&nbsp;)
  return %x
}


Then you use $htmlfree(%variable) and it will remove all HTML markup. Note that because of the -l (local) switch, the $htmlfree alias only works from within the script file it's included in. If you want it to work from the command line or in other scripts, remove the -l from the first line.

That's a slightly modified version that replaces each tag with a space, so the words stay separated. Normally the replacement is $null rather than $chr(32) in this sort of alias.
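
As a rough sketch of how the cleaned-up text could then be split into room names: this assumes the htmlfree alias above sits in the same script file, and that the raw HTML line is passed in as $1-. The alias name listrooms is made up for illustration.

Code:
alias listrooms {
  ; $1- = the raw HTML line, e.g. read from the page into a variable
  var %clean = $htmlfree($1-)
  ; Every tag was replaced by a space, so each remaining word is one
  ; room name. Token separator 32 is the space character.
  var %i = 1, %n = $numtok(%clean,32)
  while (%i <= %n) {
    echo -a Room %i $+ : $gettok(%clean,%i,32)
    inc %i
  }
}

You would call it from the same script as: listrooms %line (where %line holds the HTML). This only works cleanly if room names never contain spaces.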
Posted By: astigmatik Re: parsing - 14/02/07 05:37 AM
DISCLAIMER: I have no mIRC here, and I'm not a regex expert, but this might just work:

Code:
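alias extract_channels {
  ; The only reason I'm separating the expression is to make it clearer.
  ; Capture only the text between <a ...> and </a>; /g collects every match
  var %r = /<a[^>]*>([^<]+)<\/a>/g
  noop $regex($1-, %r)
  ; Declare %result locally so it doesn't linger as a global variable
  var %x = 1, %result
  while ($regml(%x) != $null) {
    %result = $addtok(%result, $v1, 44)
    inc %x
  }
  return %result
}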
Posted By: b1ink Re: parsing - 14/02/07 10:17 AM
You can use the HTML-strip regex together with a parser, something like this:
Code:
alias htmlchans {
  var %a = $1
  noop $regsub(%a,/(?:^[^<]*?>|<[^>]*?>|<[^>]*$)/g,.,%a)
  noop $regsub(%a,/(?:^\.+|(?<=\.)\.+|\.+$)/g,$null,%a)
  return $iif($2,$gettok(%a,$2,46),$numtok(%a,46) - %a)
}

$htmlchans(string[,N])
N refers to the Nth channel (it's optional). Without N, it returns the number of channels followed by the dot-separated list of names.
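
A small usage sketch, assuming the htmlchans alias above is loaded and the raw HTML line is passed in as $1-. The alias name showchans is only for illustration.

Code:
alias showchans {
  ; $1- = the raw HTML line from the page
  var %html = $1-
  ; Without N: the channel count, then the dot-separated list of names
  echo -a All: $htmlchans(%html)
  ; With N: only the Nth name, here the second one
  echo -a Second: $htmlchans(%html,2)
}

Since the names are joined with dots (token 46), this assumes no room name contains a dot itself.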
© mIRC Discussion Forums