Joined: May 2006
Posts: 93
Babel fish
OP Offline
I need to read part of an HTML page that I receive over a socket. The page consists of 5-6 very long lines, so I have to use binvars to store the data. I use $bfind to search for the part I need (it begins at <div id="foo"> and ends at the next </div>). The script works if both the beginning and the end are in the same binvar (I use sockread &var, so &var is overwritten each time), but if </div> arrives in the next &var it doesn't work.
I tried to join the chunks I read using bcopy, but it didn't work, so I now write the whole page to a file. But now I don't know how to search for <div id="foo">: if I use /fseek with -w it returns a wrong position, and I also don't know how to read N bytes from the file pointer using $fread. I could copy the file back into a binvar, but I don't know whether the whole file fits in a single binvar...
I need a way to read the text between <div id="foo"> and </div>. If you have any ideas, let me know.
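
One possible reason the bcopy attempt failed: as far as I know, binvars are discarded when a script finishes, so a buffer built in one SOCKREAD event is gone by the next. A minimal sketch of the file-based alternative, appending each chunk with /bwrite. The socket name page and the file name page.tmp are placeholders, not names from the original script:

```mirc
; Sketch only: "page" and page.tmp are hypothetical names.
; Assumes page.tmp is removed before the request is sent.
on *:SOCKREAD:page:{
  if ($sockerr) return
  sockread &chunk
  ; $sockbr is the number of bytes the last /sockread returned
  while ($sockbr > 0) {
    ; start -1 appends to the end of the file, length -1 writes the whole binvar
    bwrite page.tmp -1 -1 &chunk
    sockread &chunk
  }
}
```

Once the socket closes, the file holds the complete page, so a tag split across two chunks is contiguous again.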

Joined: May 2006
Posts: 93
Babel fish
OP Offline
OK, I solved the problem by reading the position of <div id="foo"> from the &binvar, copying some bytes from that position in the file into another &binvar, and cutting that &binvar where I find the </div>.
Maybe it's a little complicated, but it seems to work fine.
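
The actual script was not posted, so the following is only a rough sketch of the approach described above. The alias name fooextract, its parameters, and the 4096-byte window are assumptions made for illustration:

```mirc
; Hypothetical helper, not the script from this post.
; $1 = file holding the saved page
; $2 = zero-based byte offset in the file where <div id="foo"> starts
alias fooextract {
  var %open = <div id="foo">
  ; first byte after the opening tag
  var %from = $calc($2 + $len(%open))
  ; read at most 4096 bytes (an arbitrary window), never past the end of the file
  var %n = $calc($file($1).size - %from)
  if (%n > 4096) {
    %n = 4096
  }
  if (%n < 1) return
  bread $qt($1) %from %n &part
  ; cut the window at the first closing tag
  var %end = $bfind(&part,1,</div>).text
  if (%end > 1) return $bvar(&part,1,$calc(%end - 1)).text
}
```

If the closing tag lies beyond the window nothing is returned, and the returned text is still subject to mIRC's usual line-length limit.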

Joined: Sep 2005
Posts: 2,881
H
Hoopy frood
Offline
Another way using files and $fgetc (gets one character at a time). Kinda kludgy.

Code:
alias test {
  ; create test.dat (overwriting any old copy) and fill it with sample data
  .fopen -no test test.dat
  if ($fopen(test)) {
    .fwrite -n test xxx<div id="foo">bar hello world, you should see this</div>xxx
    .fseek test 0
    ; %data holds the byte values read so far as a space-separated list
    var %data = $fgetc(test), %now
    while (!$feof) {
      if (!%now) {
        %data = %data $fgetc(test)
        ; do the last 14 bytes spell <div id="foo"> ?
        if ($gettok(%data,$calc($numtok(%data,32) - 13) $+ -,32) == 60 100 105 118 32 105 100 61 34 102 111 111 34 62) {
          %now = $true
          %data =
        }
      }
      else {
        %data = %data $fgetc(test)
        ; do the last 6 bytes spell </div> ?
        if ($gettok(%data,$calc($numtok(%data,32) - 5) $+ -,32) == 60 47 100 105 118 62) {
          ; drop the closing tag and show what is left
          %data = $deltok(%data,$calc($numtok(%data,32) - 5) $+ -,32)
          bset &data 1 %data
          echo -a ~ $bvar(&data,1-).text
          break
        }
      }
    }
    .fclose test
  }
}

Joined: Jan 2003
Posts: 2,523
Q
Hoopy frood
Offline
That sounds fine, but you could make it a little cleaner: instead of copying 'some' bytes and then looking for </div>, why not search for </div> in the binvar, starting from the position of <div ...>? Here's a general-purpose example of this:
Code:
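alias gettext {
  ; $1 = file, $2 = opening text, $3 = closing text
  ; use a unique binvar name so successive calls don't clash
  var %b = &gettext $+ $ticks
  ; load the whole file into the binvar
  bread $qt($1) 0 $file($1) %b
  if $bfind(%b,1,$2) {
    ; first byte after the opening text
    var %pos = $v1 + $len($2)
    ; move the bytes between the two markers to the front and chop the rest (-c)
    bcopy -c %b 1 %b %pos $calc($bfind(%b,%pos,$3) - %pos)
    ; %b stores the name of the binvar holding the desired text
    return %b
  }
}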
You can test it with something like:
//var %binvar = $gettext(file.html,<div id="foo">,</div>) | echo -ag $bvar(%binvar,1-).text

Bear in mind that this won't work if
- there are child divs inside the 'foo' one
- there is extra whitespace inside the div tag you're looking for, e.g.
<div id = "foo" >
or
<div
id="foo">

etc. A version that takes both into account would have to be longer and more complex, but that's probably not necessary in this case. As for the second issue, if the format changes in the future, you can always pass different parameters to the identifier.
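
For the first issue only, here is a rough sketch of how the matching </div> could be located by counting nesting depth. The alias name divend is made up for this example, and it naively treats every occurrence of <div as the start of a nested div (so HTML comments or a tag such as <divx would confuse it):

```mirc
; Hypothetical helper, not part of gettext above.
; $1 = name of the binvar, $2 = position just after the opening <div id="foo">
; Returns the position of the matching </div>, or 0 if there is none.
alias divend {
  var %depth = 1, %pos = $2, %open, %close
  while (%depth > 0) {
    %close = $bfind($1,%pos,</div>).text
    if (!%close) return 0
    %open = $bfind($1,%pos,<div).text
    if ((%open) && (%open < %close)) {
      ; a nested div opens before the next close: go one level deeper
      inc %depth
      %pos = $calc(%open + 4)
    }
    else {
      ; this close ends the current level
      dec %depth
      %pos = $calc(%close + 6)
    }
  }
  return %close
}
```

Inside gettext, the result of $divend(%b,%pos) could stand in for the second $bfind call, with a check for 0 before the bcopy.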


