#7234 18/01/03 07:31 PM
Joined: Jan 2003
Posts: 41
Thedude (OP)
Ameglian cow
I use a socket to download an HTML page,
but how do I strip out the HTML tags so that only the plain text is left?
If you have some code for this, please let me know =)

Thanks..

#7235 18/01/03 08:29 PM
Joined: Jan 2003
Posts: 56
Babel fish
Code:
var %t = $regsub(<html>text<a haha="huhu">,/(<[^>]*>)/g,,%data)

%data will then contain the plain text ("text" in this example), and %t holds the number of tags that were removed.
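
The same call can be wrapped in a custom identifier so it works on any string. This is only a sketch: the name $striphtml is made up for illustration. Because mIRC splits identifier parameters on commas, text that contains commas arrives as several parameters and is rejoined with spaces by $1-.

Code:

alias striphtml {
  ; $regsub returns the number of replacements (kept in %n) and writes the result to %out
  var %out, %n = $regsub($1-,/<[^>]*>/g,$null,%out)
  return %out
}

For example, //echo -a $striphtml(<b>bold</b> text) should echo the text with the tags removed.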


Ecronika
My mIRC Addons
#7236 18/01/03 08:37 PM
Joined: Dec 2002
Posts: 1,321
Hoopy frood
Here is what I use for testing socket/HTML stuff. It opens a custom window for the connection and lets you type raw HTTP requests into it, so you have to know the HTTP commands.
Code:

alias www {
  if ($1 == -r) sockclose $2
  else sockopen $1 $1 80
}
on *:SOCKOPEN:*:{
  if ($sock($sockname).port == 80) window -Ek $+(@,$sockname)
}
on *:SOCKCLOSE:*: echo 4 -ti2 $+(@,$sockname) Remote host closed the connection.
on *:SOCKREAD:*:{
  if ($sockerr > 0) {
    echo $color(ctcp) -bflirt $+(@,$sockname) $+([,$sock($sockname).wserr,]) $sock($sockname).wsmsg
    return
  }
  elseif ($sock($sockname).port != 80) return
  var %string
  while (1) {
    sockread %string
    if ($sockbr == 0) return
    if (%string == $null) %string = -
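    ; strip whole tags, plus partial tags cut off at the start or end of the line
    var %s, %n = $regsub(%string,/(^[^<>]*>|<[^>]*>|<[^>]*$)/g,$null,%s)
    ; remove &nbsp; entities; tokenize also collapses repeated spaces
    tokenize 32 $remove(%s,&nbsp;)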
    if ($1) echo $color(info) -bflirt $+(@,$sockname) $1-
  }
}
on *:INPUT:@:{
  var %sockname = $right($active,-1)
  if ((/* iswm $1) || ($sock(%sockname).port != 80)) return
  if ($sock(%sockname).ip) {
    sockwrite -n %sockname $1-
    echo $color(action) -abflirt > $1-
    halt
  }
}

/www www.mirc.com
GET /index.html
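
A bare GET /index.html is the old HTTP/0.9 form, and some servers may expect a full request with a Host header and a blank line to finish it. As far as I know, pressing Enter on an empty editbox does not trigger on INPUT, so the blank line is hard to send by hand. Here is a rough sketch of a helper for that. The alias name /wwwget is made up for illustration, and it assumes the socket was opened with /www above, so the socket name is also the host name.

Code:

alias wwwget {
  ; $1 = socket name as opened by /www (same as the host), $2 = path, defaults to /
  if ($1 == $null) return
  if ($sock($1) == $null) { echo -a * /wwwget: no socket named $1 | return }
  var %path = $iif($2 != $null,$2,/)
  sockwrite -n $1 GET %path HTTP/1.0
  sockwrite -n $1 Host: $1
  ; a lone CRLF ends the request headers
  sockwrite $1 $crlf
}

Once the window has opened:

/wwwget www.mirc.com /index.html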


DALnet: #HelpDesk and #mIRC
#7237 20/01/03 01:13 PM
Joined: Dec 2002
Posts: 143
Vogon poet
Strangely, this doesn't work for me (unless it's because I'm on my work's network). I pasted the code, checked that it was formatted correctly (all OK!!), then typed

/www www.mirc.com

A custom window (@www.mirc.com) then opened with a large editbox. I then typed

GET /index.html

and nothing happened.

No error or anything. In fact, NOTHING at all!!!

Any suggestions?

(BTW, I'm just getting interested in sockets. I haven't done anything with them properly before, and I'm just learning the basics!!!)

Also, on the message boards, when I typed "[ color : brown ] [ b ] /www www.mirc.com [ /b ] [ /color ]" (without the spaces, which I added here so the code would show), why does the website still show up as a link, as if I had wrapped it in [ url ] [ /url ]? I didn't do that!

[EDIT: [[/b]color:brown][[/b]b]/www www.[[/i]/i]mirc.com[[/b]/b][[/b]/color] -Hammer]

Last edited by Hammer; 20/01/03 09:47 PM.

Aubs.

