/filtering vs using file handles.... #220840 30/04/10 10:07 PM
Joined: Apr 2003
Posts: 342
M
MeStinkBAD Offline OP
Fjord artisan
A long time ago... in a galaxy far away... a thread contained the following remark...

Originally Posted By: Riamus2
Before throwing broad generalizations out about how it's useless and not good to do and so on, why not try using it with /filter on a very large text file and compare your results. You'll find that there can be a significant difference in speed on large files (i.e. not just a few milliseconds).



Okay, this may be beating a dead horse, but please show me something to support this. PLEASE! This is an FFXI game log I've been testing on and I'm not getting a noticeable difference. I mean, it contains 170,000 lines! What am I missing? I don't understand this obsession with /filter!

Last edited by MeStinkBAD; 30/04/10 10:30 PM.

Beware of MeStinkBAD! He knows more than he actually does!
Re: /filtering vs using file handles.... [Re: MeStinkBAD] #220859 01/05/10 01:36 PM
Joined: Dec 2002
Posts: 4,521
Khaled Offline
Hoopy frood
Ideally we would need to see the script that you are using to compare both methods - this will allow us to test it out, to see if we get the same results.

Re: /filtering vs using file handles.... [Re: MeStinkBAD] #220882 01/05/10 08:50 PM
Joined: Sep 2005
Posts: 2,876
H
hixxy Offline
Hoopy frood
Code:
alias filetest {
  var %ticks = $ticks
  .fopen filetest $1-
  if (!$ferr) {
    window -h @test
    while (!$feof) aline @test . $fread(filetest)
    window -c @test
  }
  .fclose filetest
  echo -a File handling commands time taken: $calc($ticks - %ticks) ms.
  return
  :error
  if (*command halted* iswm $error) {
    close -@ @test
    if ($fopen(filetest)) .fclose $v1
  }
}
alias filtertest {
  var %ticks = $ticks
  window -h @test
  filter -fw $1- @test
  window -c @test
  echo -a Filter command time taken: $calc($ticks - %ticks) ms.
}
alias test {
  var %file = $qt($scriptdir $+ 2007.07.fflog.txt)
  filetest %file
  filtertest %file
}


File handling commands time taken: 13172 ms.
Filter command time taken: 4172 ms.
File handling commands time taken: 13093 ms.
Filter command time taken: 4094 ms.
File handling commands time taken: 13141 ms.
Filter command time taken: 4094 ms.

Filter is just over three times as fast each time.

Now, you could say the test is not 100% fair, because in the file handling one I've added a "." to each line. The reason for this is that some of your lines are blank, and it seemed quicker to add a "." than to check that each line exists with if ().

Here's one that uses if (), for argument's sake:

Code:
alias filetest {
  var %ticks = $ticks
  .fopen filetest $1-
  if (!$ferr) {
    window -h @test
    while (!$feof) {
      if ($fread(filetest)) aline @test $v1
    }
    window -c @test
  }
  .fclose filetest
  echo -a File handling commands time taken: $calc($ticks - %ticks) ms.
  return
  :error
  if (*command halted* iswm $error) {
    close -@ @test
    if ($fopen(filetest)) .fclose $v1
  }
}
alias filtertest {
  var %ticks = $ticks
  window -h @test
  filter -fw $1- @test
  window -c @test
  echo -a Filter command time taken: $calc($ticks - %ticks) ms.
}
alias test {
  var %file = $qt($scriptdir $+ 2007.07.fflog.txt)
  filetest %file
  filtertest %file
}


File handling commands time taken: 13922 ms.
Filter command time taken: 4093 ms.
File handling commands time taken: 13953 ms.
Filter command time taken: 4094 ms.
File handling commands time taken: 13938 ms.
Filter command time taken: 4094 ms.

Still three times faster..

And here's one that filters to NUL and uses /noop in the file handling commands:

Code:
alias filetest {
  var %ticks = $ticks
  .fopen filetest $1-
  if (!$ferr) {
    while (!$feof) noop $fread(filetest)
  }
  .fclose filetest
  echo -a File handling commands time taken: $calc($ticks - %ticks) ms.
  return
  :error
  if (*command halted* iswm $error) {
    close -@ @test
    if ($fopen(filetest)) .fclose $v1
  }
}
alias filtertest {
  var %ticks = $ticks
  filter -ff $1- NUL
  echo -a Filter command time taken: $calc($ticks - %ticks) ms.
}
alias test {
  var %file = $qt($scriptdir $+ 2007.07.fflog.txt)
  filetest %file
  filtertest %file
}


In this third test the file handling actually isn't that much slower than filter, but filter still takes the crown:

File handling commands time taken: 7031 ms.
Filter command time taken: 5125 ms.
File handling commands time taken: 7047 ms.
Filter command time taken: 5125 ms.
File handling commands time taken: 7031 ms.
Filter command time taken: 5141 ms.
File handling commands time taken: 7031 ms.
Filter command time taken: 5125 ms.
File handling commands time taken: 7031 ms.
Filter command time taken: 5109 ms.

I think the major benefit of /filter is actually that it's less code to write, not so much that it's so much speedier than using the file handling routines.

Re: /filtering vs using file handles.... [Re: hixxy] #220885 01/05/10 10:25 PM
Joined: Oct 2004
Posts: 8,327
Riamus2 Offline
Hoopy frood
You seem to be proving that filtering IS faster.

Your original quote from that thread:
Quote:
If you are concerned about speed then don't use /filter for reading files. And speed is pretty irrelevant these days...


So let's see... in 2 tests, /filter is a little over 3x faster (9 seconds less). In the last test, it's still almost 2 seconds faster. So depending on what you're doing with large files, /filter is worth doing if you are concerned about speed. That was why you were told to run a test before saying you shouldn't use /filter if you're concerned about speed.

As far as using NUL goes, /filter is still faster than the file handling methods in your test. As to whether filtering to NUL is itself faster or slower than filtering to some other target, you should run a test that compares /filter to /filter, with NUL as the only difference.
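
A minimal sketch of such a /filter-to-/filter comparison, reusing the commands already shown in this thread. The alias name filternultest is made up for illustration, and the hidden window @test is simply borrowed from the earlier tests as the "other target":

Code:
alias filternultest {
  ; hypothetical alias: /filternultest <path to log file>
  var %file = $qt($1-), %ticks = $ticks
  ; run 1: filter every line to the NUL device
  filter -ff %file NUL
  echo -a Filter to NUL time taken: $calc($ticks - %ticks) ms.
  ; run 2: same file, same switches apart from the target type
  %ticks = $ticks
  window -h @test
  filter -fw %file @test
  window -c @test
  echo -a Filter to window time taken: $calc($ticks - %ticks) ms.
}

As in the earlier tests, opening and closing the window is counted in the second timing, so the two figures are only a rough comparison.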

If there was something different you were trying to point out other than how /filtering is faster (as was stated in the first place), please explain.

Last edited by Riamus2; 01/05/10 10:29 PM.

Invision Support
#Invision on irc.irchighway.net
Re: /filtering vs using file handles.... [Re: Riamus2] #220888 02/05/10 01:13 AM
Joined: Sep 2005
Posts: 2,876
H
hixxy Offline
Hoopy frood
You appear to be getting me confused with the OP :D

Re: /filtering vs using file handles.... [Re: hixxy] #220892 02/05/10 02:20 AM
Joined: Aug 2004
Posts: 7,252
R
RusselB Offline
Hoopy frood
He may have just hit the Reply button without realizing, or without knowing, that your name would be directly associated with the reply, rather than it being a reply to no one specific person.

Re: /filtering vs using file handles.... [Re: hixxy] #220897 02/05/10 04:28 AM
Joined: Oct 2004
Posts: 8,327
Riamus2 Offline
Hoopy frood
Lol, yes, I did. I didn't look at the name, apparently. I thought the OP was responding with his example. Now it makes sense that it proves what we were saying. :)


Re: /filtering vs using file handles.... [Re: Riamus2] #220898 02/05/10 05:32 AM
Joined: Apr 2003
Posts: 342
M
MeStinkBAD Offline OP
Fjord artisan
Code:
alias filetest {
  var %lines = $lines($1-), %ticks = $ticks
  .fopen filetest $1-
  if (!$ferr) {
    /fseek -l filetest %lines
  }
  .fclose filetest
  echo -a File handling commands time taken: $calc($ticks - %ticks) ms.
  return
  :error
  if (*command halted* iswm $error) {
    close -@ @test
    if ($fopen(filetest)) .fclose $v1
  }
}
alias filtertest {
  var %ticks = $ticks
  filter -ff $1- NUL
  echo -a Filter command time taken: $calc($ticks - %ticks) ms.
}
alias benchtest {
  var %file = $qt($my.ff.log)
  filetest %file
  filtertest %file
}


File handling commands time taken: 2313 ms.
Filter command time taken: 2984 ms.

Hmmmm... you people don't know how to use /fseek!

I don't care if you choose Hixxy's way or mine. Both show that the operation itself is not the bottleneck. Using /fseek <handle> $file(<file>).size will take less than a millisecond to execute. You need to perform this operation every time you open a file to append to it. The bottleneck is the interpreter. And 170,000 lines in 5 seconds comes to 34,000 lines per second. "Servers.ini" uses maybe 1000. The default setting for the window buffer is 5000 lines. So why the obsession with /filter?

Last edited by MeStinkBAD; 02/05/10 09:14 AM.

Re: /filtering vs using file handles.... [Re: MeStinkBAD] #220905 02/05/10 01:11 PM
Joined: Sep 2005
Posts: 2,876
H
hixxy Offline
Hoopy frood
Erm, there's a slight difference in what those two tests are doing:

/fseek is just positioning the file pointer at the end of the file. /filter is painstakingly processing each line and sending it to the NUL device.

See the difference? And the fact that /filter is still almost as fast as /fseek alone, clearly shows that /filter is a lot faster.

If you want to test the two fairly, you have to have them do the same thing and see which one does it faster.

If you wanted to READ the data from the file, then simply /fseek'ing to the end would not be that useful, would it? Even if the bottleneck is the interpreter (which I don't think anybody ever denied?), the fact is it's quicker and easier to run through the contents of a file using /filter than it is with fopen/$fread/fclose.

Re: /filtering vs using file handles.... [Re: hixxy] #220911 02/05/10 04:54 PM
Joined: Apr 2003
Posts: 342
M
MeStinkBAD Offline OP
Fjord artisan
/fseek -l <handle> reads every character in the file up to the desired line. It has to. It has to search for every EOL, increment a counter, and check if it's on the desired line.

/fseek alone (without any switches) doesn't. It's instantaneous. I mean 0 ms. No time at all. Go ahead. Try /fseek <handle> $file(<file>).size. Or /fseek <handle> $r(1,$file(<file>).size). Whichever. Just type it in the command line.
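
For anyone who would rather time it from a script than from the command line, here is a rough sketch under a few assumptions: the alias name and the handle name seektest are invented for this example, and $ticks may be too coarse to show anything but 0 ms for a single call.

Code:
alias seektest {
  ; hypothetical alias: /seektest <path to file>
  .fopen seektest $qt($1-)
  if (!$ferr) {
    var %ticks = $ticks
    ; plain /fseek with no switches: jump straight to a byte offset
    fseek seektest $file($1-).size
    echo -a Plain /fseek time taken: $calc($ticks - %ticks) ms, pointer now at byte $fopen(seektest).pos
  }
  .fclose seektest
}

Swapping the plain fseek line for fseek -l seektest $lines($1-) gives the line-based variant from the earlier benchmark for comparison.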

/filter forces you to output to something. And NUL is a device to send data you want to discard. It always returns EOF on read, regardless what you "write" to it.

The end result is the same using /fseek -l or /filter > NUL. If two operations produce the same end result, you've accomplished the same thing.

I honestly think this is ironic... I included -l to *be* fair. If I hadn't, the result would have been...

File handling commands time taken: 0 ms.
Filter command time taken: 2984 ms.

Last edited by MeStinkBAD; 02/05/10 04:59 PM.

Re: /filtering vs using file handles.... [Re: MeStinkBAD] #220913 02/05/10 05:36 PM
Joined: Jul 2006
Posts: 3,558
W
Wims Offline
Hoopy frood
I think you're really missing something. I suggest you re-read what has been said.

You shouldn't make false statements: /fseek doesn't read anything at all, and /fseek and /filter aren't related in any way either.
Originally Posted By: /help /fseek
/fseek <name> <position>
Sets the read/write pointer to the specified position in the file. The following switches may also be used to move the file pointer:




Looking for a good help channel about mIRC? Check #mircscripting @ irc.swiftirc.net
Re: /filtering vs using file handles.... [Re: Wims] #220917 02/05/10 07:18 PM
Joined: Apr 2003
Posts: 342
M
MeStinkBAD Offline OP
Fjord artisan
/fseek <name> <position>
Sets the read/write pointer to the specified position in the file. The following switches may also be used to move the file pointer:

  • -l <name> <line number>
  • -n <name>
  • -w <name> <wildcard>
  • -r <name> <regex>

If it didn't read the file, then how could it find the position of a wildcard string in the file? /fseek without any switches does not read the file. It just calls the low level I/O call lseek (or equivalent). With switches it must. But there is no low level I/O call for searching for strings or characters, let alone PCRE.

Do you understand?

Re: /filtering vs using file handles.... [Re: MeStinkBAD] #220919 02/05/10 07:47 PM
Joined: Sep 2005
Posts: 2,876
H
hixxy Offline
Hoopy frood
When I said read, I meant that the data read would be available to you to use in your script - /fseek does not make the read data available.

With your /fseek example there is none of the extra overhead that /filter has: having to fopen() NUL, fwrite() each line, then fclose(). Even though the results are discarded, mIRC will still wait for fwrite() to return a value before moving on to the next line.

And again - this is a very silly example, because there's little point in reading all of the data in a file but not doing anything with it.

Try coming up with practical examples and /filter will win nearly every time... like in my examples earlier in the thread :) (reading the contents of a file into an @window)

You're right in saying that the biggest bottleneck is the interpreter, but that still means /filter is more efficient than the file handling command alternative.
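
As a rough idea of what a more practical case might look like, the sketch below pulls only the lines matching a wildcard into a hidden window. The alias name findlines and its argument order are made up for this example; it was not part of the tests run earlier in the thread.

Code:
alias findlines {
  ; hypothetical alias: /findlines <wildcard> <path to file>
  ; e.g. /findlines *defeats* C:\logs\2007.07.fflog.txt
  var %ticks = $ticks
  window -h @test
  ; copy only the lines that match the wildcard in $1 into @test
  filter -fw $qt($2-) @test $1
  echo -a /filter matched $filtered lines in $calc($ticks - %ticks) ms.
  window -c @test
}

The file handling equivalent would be the $fread loop from the earlier tests with an extra iswm check on every line, which is more script code for the interpreter to run per line.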