
#144427 10/03/06 08:31 AM
Abe (OP)
Pikka bird
Joined: Mar 2006
Posts: 11
If you were to search a big text file (e.g. 200k lines), what would be the fastest method: /filter or $read?

Hoopy frood
Joined: Aug 2004
Posts: 7,252
It's my understanding that /filter would be the faster of the two methods for this. Of course, you could test it yourself: store the current time in a variable, run one method, record the time it took, then do the same for the second method (you could even reuse the same variable) and compare the two results.
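A rough timing harness along these lines would do it (the filename data.txt and the pattern *needle* are placeholders; $ticks is a millisecond counter):

Code:
alias benchsearch {
  var %t = $ticks
  var %r = $read(data.txt, nw, *needle*)
  echo -a read took: $calc($ticks - %t) ms
  var %t = $ticks
  filter -ff data.txt nul *needle*
  echo -a filter took: $calc($ticks - %t) ms
}

For a fair comparison you'd want to run each method many times in a loop, since a single call on a small file can finish in under one tick.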

Fjord artisan
Joined: Feb 2006
Posts: 546
do you want to count the number of matches, or just check whether a string exists?

the quickest equivalent to checking if ($read(file,w,string) != $null) is of course

Code:
//filter -ff file nul string | if ($filtered) {


i just tested the two over 500 iterations a bunch of times, and found them to be extremely close together. i didn't even bother testing further; i find it safe to conclude that $read is more suitable, since it's the more logical method for this purpose (the trick of using an invalid outfile such as nul to speed /filter up here isn't documented, and /filter is usually used to deal with multiple lines), and it's shorter/simpler.

if you wanted to further use the matched line, like if ($read(...)) { commands $v1 }, the way i used /filter wouldn't make that as easy. you'd need to filter to another location and retrieve the line from there, and you'd lose time doing that ;D
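for the record, retrieving the matched line via /filter would look roughly like this (the window name @tmp and pattern are arbitrary):

Code:
window -h @tmp
filter -fwc data.txt @tmp *needle*
if ($filtered) { commands $line(@tmp, 1) }
window -c @tmp

that's three extra commands compared to a single $read call, which is the overhead i was referring to.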

so to summarize: i'd use $read when it's most suitable, i.e. when i'm only interested in the first match, and i'd use /filter in cases where i'm looking for several lines.


"The only excuse for making a useless script is that one admires it intensely" - Oscar Wilde
Hoopy frood
Joined: Feb 2004
Posts: 2,019
You can't say that /filter is faster than $read, as it depends entirely on what your goal is.

As far as hard disk efficiency goes, $read and /filter are no different: they both open and close the file once. The difference is that /filter runs through all lines of the file and dumps the matching lines to some sort of output (a file, a window, an alias, etc.).

If you know this, $read can be faster than /filter (we'd have to bench it to be sure). To illustrate: say you want to return one line from a file that matches a certain string. $read opens the file and checks each line for a match; when it finds one, it returns the result. Now let's say this result was found in the middle of the file (there were 5000 lines, so at line 2500).

Compare this to what you would do with /filter. First of all, you need to create a window or a file to hold the filtered results. There is already overhead in this that $read doesn't suffer from, since $read is an identifier and returns the result directly. Second, /filter goes through all lines in the file (5000 instead of 2500), even though we were already satisfied with the line at line number 2500.

The point here is: you cannot say one is faster than the other, you'd have to take into account in which setting it is to be used, and to be certain you'd even have to bench.

What you can know, on the other hand, is that /filter becomes more and more preferable over $read as the number of necessary $read calls grows. That's why while loops built around $read are usually frowned upon by good scripters: they know that each call of $read opens and closes the file. With /filter, the file is opened once, all lines are gone through, the file is closed, and then you can do something with the results. If I have to choose between opening/closing a file 200 times in a row and opening/closing it just once, I know which method I'm going to prefer.
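To make the difference concrete, here is a sketch of both approaches counting matching lines (filename and pattern are hypothetical):

Code:
; opens and closes the file on every single $read call
alias slowcount {
  var %i = 1, %total = $lines(data.txt), %n = 0
  while (%i <= %total) {
    if (*needle* iswm $read(data.txt, n, %i)) inc %n
    inc %i
  }
  echo -a matches: %n
}
; opens the file once, scans every line, closes it
alias fastcount {
  filter -ff data.txt nul *needle*
  echo -a matches: $filtered
}

The first alias does 5000 open/read/close cycles on a 5000-line file; the second does one.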

In conclusion, if he only wants to return one result from a file, then $read is the way to go, not /filter, as /filter implies extra code to return the result to the place where you want it.

Note that if he is going to access this file frequently, a much better alternative would be a hidden window, on which he can then use $fline, or a hash table, on which he can use $hfind. Note also that you can't take advantage of the hashing algorithm when you use $hfind with a wildcard search: mIRC must loop through the items, doing an iswm check on each of them until it finds a match, which makes it pretty much the same as what $fline does. I mention this because some people think hash tables are a magical thing that is always fast, when the truth is it depends on what you want to do and what kind of data you are holding.
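A minimal sketch of both alternatives (names are made up; I'm assuming /loadbuf -r to clear and fill the window, $fline to return the number of the Nth matching line, and $hfind with the W flag to wildcard-search item data):

Code:
; hidden window + $fline
window -h @cache
loadbuf -r @cache data.txt
echo -a first match is on line: $fline(@cache, *needle*, 1)
; hash table + $hfind
hmake cache 100
hadd cache item1 some line of text
echo -a first matching item: $hfind(cache, *needle*, 1, W)

Either way, the file is read from disk once up front, and every later search runs against memory.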


Gone.
Hoopy frood
Joined: Sep 2003
Posts: 4,230
As FiberOptics has said, it really comes down to what you're doing.

If you just need to find whether the word is in the file, use $read(file,ntw,*word*)

If you need a total count of matches, use filter -ff file nul *word* and then $filtered holds the count.

If you need a list of matching lines, use filter -fwc file @win *word* and then $filtered holds the count and @win holds the list (you need to make the window first: window -h @win makes a nice hidden window).

Here's one I sometimes need:
If you need a list of the matches AND the line numbers they're on in the file, use filter -fwcn file @win *word*
This gives you the list of matching lines, but at the start of each line is the line number it came from in the file, followed by a space, i.e.: <linenumber> <line in file>

