mIRC Forums

#73915 05/03/04 09:27 PM
shawnrgr
I'm working on a good "findline" alias. I've found a few, but they seem to be slooow. [ example: $findline(file.txt, *string*, 2) ]
Some use /filter to dump all the matches in a temp window, and some use /fseek and the other /f... commands. Which would be faster for finding a line in a file: /fseek or /filter?
And if /fseek would probably be faster, is it bad to constantly /fopen and /fclose a file? I'm talking probably 50+ times a minute.

#73916 05/03/04 09:51 PM
starbucks_mafia · Hoopy frood · Joined: Dec 2002 · Posts: 2,884
/f* will be much faster than /filter, and no, it's not a good idea to open/close files constantly like that. If you're accessing it that often, you should simply load it into a hash table and save yourself a whole load of trouble.
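To illustrate the hash table approach, here is a minimal sketch (the table name, alias names, and file name are made up for the example; /hload -n loads a plain text file one line per item, with line numbers as item names):

Code:
; load file.txt into the hash table "lines", one item per line
alias loadlines {
  if ($hget(lines)) { .hfree lines }
  .hmake lines 100
  .hload -n lines file.txt
}
; $findinlines(*string*,N) - return the Nth line whose text matches the wildcard
alias findinlines {
  var %item = $hfind(lines, $1, $2, w).data
  if (%item != $null) { return $hget(lines, %item) }
}

Since all lookups happen in memory, this avoids touching the disk on every search; the table only needs reloading when the file actually changes.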

#73917 05/03/04 09:53 PM
shawnrgr
Yeah, but the file is constantly updated, probably at least 2 times every 10 seconds. So wouldn't I have to reload the file anyway?

#73918 05/03/04 10:07 PM
starbucks_mafia · Hoopy frood · Joined: Dec 2002 · Posts: 2,884
Well if the file is updated by mIRC scripts then you can just edit them to update the hash tables instead. If it's some other program then you're stuck with /f* I'm afraid.

#73919 05/03/04 10:38 PM
qwerty · Hoopy frood · Joined: Jan 2003 · Posts: 2,125
Well I'd expect /f* to be faster than /filter-ing to a hidden window but not much faster. In both cases the file is opened/closed once and all searches are done in memory.

The main difference that makes /f* faster is that /filter reads the entire file and dumps those lines that match $2 in the window, whereas the /f* commands don't need to scan to the end of the file, only to the Nth matching line. But this speed difference becomes smaller as N in $findline(file,*wildstring*,N) grows. For N = <total number of matches>, the difference is almost negligible: /fseek would have to scan the entire file, so it would essentially read the same amount of data as /filter.

However, no matter what you, I, etc. think is faster, the best way to make sure is to benchmark, and that's what the original poster should do.

#73920 05/03/04 11:12 PM
starbucks_mafia · Hoopy frood · Joined: Dec 2002 · Posts: 2,884
Well, presumably, if the alias is used enough, the number of lines that need to be read on any given file should average out to half the total number of lines. So ultimately /f* would be roughly twice as fast over a large number of uses. Even on an individual search where the returned line is the last in the file, it should be at least a tiny bit faster, because with /filter you must also iterate over the window afterwards.

As you said though, benchmarks should be made of both methods regardless of theory.

#73921 05/03/04 11:21 PM
shawnrgr
I started putting it together. Can you guys walk me in an OK direction, or just let me know if I'm going about this the right way? Here is what I have so far:
Code:
;gline "get line" $gline(file,*string*,N) (if N = 0, return total)
gline {
  var %file = $1, %string = $2, %num = $3, %tot = 0
  if ($fopen(ff)) { .fclose ff }
  .fopen ff %file
  if (%num == 0) {
    while (!$feof) {
      .fseek -w ff %string
      if ($fread(ff)) { inc %tot }
    }
    .fclose ff
    if (%tot >= 1) { return %tot }
    else { return $null }
  }
}

#73922 06/03/04 12:09 AM
starbucks_mafia · Hoopy frood · Joined: Dec 2002 · Posts: 2,884
Well, I haven't used the /f* commands that much, so maybe I'm doing something stupid here, but I found the /filter equivalent to be much faster than /f* (and the /f* version is a hell of a lot more complex, too).

Code:
;gline "get line" $gline(file,*string*,N) (if n=0 return total) 
gline {
  var %file = $1, %string = $2, %num = $int($3), %r = 0
  if ($fopen(ff)) .fclose ff
  .fopen ff %file
  if ($ferr) return $null
  if (%num == 0) {
    while (!$feof) {
      .fseek -w ff %string
      if ($fread(ff)) inc %r
    }
  }
  else {
    var %line
    while (!$feof) {
      .fseek -w ff %string
      if ($fread(ff)) {
        inc %r
        %line = $ifmatch
        if (%num == %r) {
          var %r = %line
          break
        }
      }
    }
  }
  .fclose ff
  return %r
}

gfline {
  var %file = $1, %string = $2, %num = $int($3)
  window -h @gfline_find
  filter -fwc %file @gfline_find %string
  var %r = $line(@gfline_find, %num)
  window -c @gfline_find
  return %r
}


$gfline(), which uses /filter, works out over twice as fast in my benchmark (56ms vs. 122ms), although it did cause some minor screen flicker, which could get very annoying if it's used a lot.

Edit: Missed a pair of braces

Last edited by starbucks_mafia; 06/03/04 12:18 AM.
#73923 06/03/04 12:23 AM
shawnrgr
Well, I did a bunch of benchmarks. The result? /filter has proven to be much faster than /fseek. Here was my echo:

-
filter : found 3003 : 40ms
-
seek : found 3003 : 520ms
-

#73924 06/03/04 12:51 AM
starbucks_mafia · Hoopy frood · Joined: Dec 2002 · Posts: 2,884
Just goes to show that things should always be benchmarked instead of relying on logic to decide which method to use. The parsing of the while loop just can't match /filter's internal system even when it's only operating on a fraction of the file.

#73925 06/03/04 12:51 AM
qwerty · Hoopy frood · Joined: Jan 2003 · Posts: 2,125
Agreed. I guess it was all about how much is "much". ;)

Another thing I had in mind but somehow forgot to include in my previous post is the handling of N = 0. For that functionality, the /f* version would have to scan the entire file, incrementing a variable in the while loop, etc. This could even be slower than a simple filter -ff file.txt nul *matchtext* | return $filtered.

Quote:
...because with /filter you must also iterate over the window afterwards

Actually, you don't. You can do something like: filter -fw file.txt @hidden *matchtext* | return $line(@hidden,$3)
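Putting those two one-liners together, a version of the /filter-based alias that handles N = 0 without any window at all might look like this (just a sketch along the lines described above; the window name is arbitrary):

Code:
gfline {
  if ($int($3) == 0) {
    ; N = 0: only the match count is needed, so filter to nul and use $filtered
    filter -ff $1 nul $2
    return $filtered
  }
  window -h @gfline_find
  filter -fwc $1 @gfline_find $2
  var %r = $line(@gfline_find, $int($3))
  window -c @gfline_find
  return %r
}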


#73926 06/03/04 01:08 AM
qwerty · Hoopy frood · Joined: Jan 2003 · Posts: 2,125
These apparently contradictory results teach an important lesson in benchmarking: the relative performance of two or more routines depends greatly on the type of input. The aliases you checked would most probably show different results when applied to files of different lengths and with different matchtext. By observing the results, you can often construct models good enough to explain those results, but it's usually quite hard to make such good models beforehand, i.e. before seeing the benchmark results.

For example, $gline(versions-full.txt,*join*,0) is faster than $gfline(versions-full.txt,*join*,0) by about 10% here (versions-full.txt is the full versions.txt, which is 446 KB).

Last edited by qwerty; 06/03/04 01:09 AM.
#73927 06/03/04 01:10 AM
shawnrgr
Yeah, it's obviously good to benchmark. Thanks for the help, guys. One more thing: since I obviously have a lot of info in this file (which is an INI, by the way...), I thought I saw somewhere something about "flushing" an INI frequently, because INIs are stored in memory (I think it's /flushini). Is this true? And if so, would it help increase performance if I was writing to this INI an average of 30-50 times a minute?
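The command mentioned here does exist: mIRC keeps a cached copy of INI files in memory, and /flushini writes that cache back to disk. A minimal example (data.ini is a placeholder filename, not one from this thread):

Code:
; write mIRC's in-memory copy of data.ini out to disk
/flushini data.ini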

