OP
Vogon poet
Joined: Jan 2003
Posts: 109 |
Hi - I have a script to send new news headlines from a particular web page to a channel. The headlines are being retrieved ok, but I'm trying to make it so only a new story makes it to the channel. I've made a check variable to see if it's the last story or not, but the problem is it keeps reading down the website and sets an older item... Any ideas how I can get this to just read the first instance of my match string and not ones further down the webpage? Thanks!
on 1:sockread:abctest:{
  if ($sockerr > 0) return
  :nextread
  sockread %temp
  if ($sockbr == 0) return
  if (%temp == $null) %temp = -
  if (<div class="black9pt"> isin %temp) {
    set %abctest $left($gettok(%temp,3-,62),-36)
    if (%abctest.last != %abctest) {
      set %abctest.last %abctest {
        msg #Channel %abctest
        sockclose abctest
      }
    }
  }
}
Hoopy frood
Joined: Mar 2003
Posts: 1,271 |
Would be easier with the socket connection data so we could test, but this line is bad either way:
set %abctest.last %abctest {
The set command does not allow for the opening of brackets, so the { at the end of that line has to go.
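For reference, here is the sockread event from the first post with only that stray bracket removed. This is a sketch, not tested here, and nothing else in the event has been changed, so it does not yet address the "older headline" problem.

```mirc
on 1:sockread:abctest:{
  if ($sockerr > 0) return
  :nextread
  sockread %temp
  if ($sockbr == 0) return
  if (%temp == $null) %temp = -
  if (<div class="black9pt"> isin %temp) {
    set %abctest $left($gettok(%temp,3-,62),-36)
    if (%abctest.last != %abctest) {
      ; the { that followed this /set has been removed
      set %abctest.last %abctest
      msg #Channel %abctest
      sockclose abctest
    }
  }
}
```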
DALnet #Helpdesk I hear and I forget. I see and I remember. I do and I understand. -Confucius
OP
Vogon poet
Joined: Jan 2003
Posts: 109 |
thanks - the first part of the script is:
alias abctest { sockopen abctest abcnews.go.com 80 }
on 1:sockopen:abctest:{
  sockwrite -n abctest GET /wire/world/index.html HTTP/1.0
  sockwrite -n abctest Host: abcnews.go.com
  sockwrite -n abctest Accept: image/gif, image/x-xbitmap, image/jpeg, image/pjpeg, */*
  sockwrite -n abctest Accept-Language: en
  sockwrite -n abctest Proxy-Connection: Close
  sockwrite -n abctest User-Agent: Mozilla/4.0 (compatible; MSIE 5.5; Windows 98)
  sockwrite -n abctest Pragma: no-cache
  sockwrite abctest $crlf
}
Hoopy frood
Joined: Mar 2003
Posts: 1,271 |
Ah, I should have turned on my braincells earlier. The reason for your error is that once the script stores a new headline, the next headline it reads also makes the if-statement true, so %abctest.last always ends up holding the last headline retrieved. One way of fixing this:
- set a variable (%tmp) when you open the socket
- only perform the if statement for a newer headline if the variable exists
- if a newer one is found, change the newest-headline variable and unset %tmp
No guarantees on this code, I can't test it here:
alias abctest {
  sockopen abctest abcnews.go.com 80
}
on *:sockopen:abctest: {
  sockwrite -n abctest GET /wire/world/index.html HTTP/1.0
  sockwrite -n abctest Host: abcnews.go.com
  sockwrite -n abctest Accept: image/gif, image/x-xbitmap, image/jpeg, image/pjpeg, */*
  sockwrite -n abctest Accept-Language: en
  sockwrite -n abctest Proxy-Connection: Close
  sockwrite -n abctest User-Agent: Mozilla/4.0 (compatible; MSIE 5.5; Windows 98)
  sockwrite -n abctest Pragma: no-cache
  sockwrite abctest $crlf
  ; new: set the flag when the socket opens
  set %tmp 1
}
on *:sockread:abctest: {
  if ($sockerr > 0) return
  sockread %temp
  if ($sockbr == 0) return
  if (%temp) {
    ; new: the && (%tmp) check
    if (<div class="black9pt"> isin %temp) && (%tmp) {
      set %abctest $left($gettok(%temp,3-,62),-36)
      if (%abctest.last != %abctest) {
        set %abctest.last %abctest
        msg #Channel %abctest
        ; new: clear the flag once a newer headline is found
        unset %tmp
      }
    }
  }
  sockclose abctest
}
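A possible variation on the steps above, offered as an untested sketch rather than a confirmed fix: clear %tmp on the first matching line whether or not that headline is new, and close the socket right there, so older headlines further down the page never reach the comparison. The names %line and %headline are introduced for this example only. The wildcard match is borrowed from the form used later in this thread, and the token extraction is the one from the original post.

```mirc
on *:sockread:abctest: {
  if ($sockerr > 0) return
  var %line
  sockread %line
  if ($sockbr == 0) return
  ; only the first matching line of this connection gets through
  if (*div*class="black9pt"* iswm %line) && (%tmp) {
    ; clear the flag even when the headline turns out not to be new
    unset %tmp
    var %headline = $left($gettok(%line,3-,62),-36)
    if (%headline != %abctest.last) {
      set %abctest.last %headline
      msg #Channel %headline
    }
    ; nothing further down the page is needed
    sockclose $sockname
  }
}
```

This assumes the newest story is always the first matching line on the page, which is what the original question describes.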
OP
Vogon poet
Joined: Jan 2003
Posts: 109 |
thanks LocutusofBorg - same problem it seems, it's alternating between the top 3 or so headlines and putting them in instead... pity the strings are all the same otherwise this wouldn't be a problem! arghhh
OP
Vogon poet
Joined: Jan 2003
Posts: 109 |
got around the problem by writing the line to a txt file and comparing off that - not ideal I guess but does the job very well! thanks for the help
Hoopy frood
Joined: Mar 2003
Posts: 1,271 |
OK, I think I know what the problem is, and I found a way to fix it. I *think* (judging from my tests) that the socket is receiving information faster than the script can process it. I tried several different ways of ending the sockread once a new headline is found, but one or two extra headlines still got through. My solution: write all the headlines to a window and, when the socket closes, compare the first line in the window with the saved one. Not the most gracious, but it works.
; open the socket
alias abctest { sockopen abctest abcnews.go.com 80 }

on *:sockopen:abctest: {
  ; HTTP protocol stuff
  sockwrite -n $sockname GET /wire/world/index.html HTTP/1.0
  sockwrite -n $sockname User-Agent: Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.0)
  sockwrite -n $sockname Connection: close
  sockwrite -n $sockname host: abcnews.go.com
  sockwrite -n $sockname Accept: */*
  sockwrite -n $sockname $crlf
  ; open a window and hide it, we don't need to see it
  window -h @abc_tmp
}

on *:sockread:abctest: {
  sockread %abc_temp
  ; only proceed if the line has a valid value
  if (*div*class="black9pt"* iswm %abc_temp) {
    ; remove all HTML codes
    %abc_temp = $removehtml(%abc_temp)
    ; tokenize so I can use $1 $2 etc
    tokenize 32 %abc_temp
    ; read all but the last two tokens (which is the time it was posted)
    var %abc_headline = $eval($+($,1-,$calc($0 - 2)),2)
    ; write to the window
    aline @abc_tmp %abc_headline
  }
}

on *:SOCKCLOSE:abctest: {
  ; compare window's first line to saved headline and save when necessary
  if ($line(@abc_tmp,1) != %headline.last) set %headline.last $line(@abc_tmp,1)
  ; cleaning up the mess
  close -@ @abc_tmp
  unset %abc_temp
}

; a long time ago, in a galaxy far, far away, Hammer gave me a regex to remove HTML codes
alias removehtml {
  var %return, %regex = $regsub($1-,/(^[^>]*>|<[^>]*>|<[^<]*$)/g,$null,%return)
  return $remove(%return,$chr(9),&nbsp;)
}
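The SOCKCLOSE event above only saves the newest headline. A minimal variation that also announces it might look like the sketch below. It assumes the #Channel target from the first post, uses %first as a name introduced for this example, and is meant to replace the SOCKCLOSE event above rather than sit alongside it. Not tested.

```mirc
on *:SOCKCLOSE:abctest: {
  ; first line of the hidden window = topmost headline on the page
  var %first = $line(@abc_tmp,1)
  ; skip empty results, and only act when the headline has changed
  if (%first != $null) && (%first != %headline.last) {
    set %headline.last %first
    msg #Channel %first
  }
  ; cleaning up, same as before
  close -@ @abc_tmp
  unset %abc_temp
}
```

The $null check keeps a failed or empty fetch from overwriting the saved headline.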
Hoopy frood
Joined: Mar 2003
Posts: 1,271 |
I see you had the same idea I had. I just used a window because writing to and reading from it is faster than with a file.
OP
Vogon poet
Joined: Jan 2003
Posts: 109 |
thanks again LocutusofBorg for the help - all working very nicely now - I have happy customers!