Hoopy frood (OP) · Joined: Oct 2004 · Posts: 8,330
Well, I'm working on a specialized "full" script for my channel, so I have yet another question. I was planning on saving data in a hash table to track top-10 information over a large volume of data. The problem is that there isn't a command for sorting a hash table (we should have $hsort ). And if I save the table, each item and its data end up on separate lines rather than as item=data or something similar on one line, so I can't just sort the file with /filter. I know I could run a script to put the item and data on one line, but that's time-consuming (CPU-consuming) for large amounts of data. I could also check all the items and sort inside the hash table itself, but that is also time/CPU-consuming for large amounts of data. So, my question is: what is a quick method to sort the data that isn't likely to lag mIRC or cause huge CPU spikes? Think of it like: Item: Nick, Data: Score.
Invision Support #Invision on irc.irchighway.net
Hoopy frood · Joined: Feb 2004 · Posts: 2,019
For fewer than 5k items, it could still be quite feasible to loop through the hash table, writing the item=data pairs to a file with the file handling commands, and issuing /filter on the file. Or you can loop through them and /aline them to a hidden window; the speed will be very similar.
However, as your data grows, which it surely will, looping through tens of thousands of entries and writing them to a file merely for the purpose of sorting will become inadequate.
I would recommend using a hidden window, with: item data.
Sorting will be easy with filter and the t flag.
Looking up data from an entry will be equally easy with: $fline(@win,$1 *,1)
Obviously, not using hash tables also has its drawbacks, chiefly the lookup of data for a specified item: a hash table uses a hashing algorithm, whereas a hidden window requires a linear search.
I haven't benchmarked $fline, though on thousands of lines it's obvious that $hget will be much faster than $fline.
Yet another way is to keep using hash tables: /hsave the table to a file, then /filter on the second column with the t flag and delimiter 13. Then you have a file with the data in sorted order, each followed by its item. You will then need to loop through your hash table with $hfind to find the nickname that corresponds with each data value.
Another way is to keep your hash table but save it as an ini file, and then /filter it with the t flag again. Problems with this approach:
* /hsave -i gets terribly slow once you have 500 items or more.
* Nicknames containing [ ] chars are transformed to ~ with no way of knowing what they originally represented. Imo this is the worst way.
However, you could $encode your nicknames before saving (and $decode them when reading back), avoiding this [ ] -> ~ issue, but that still doesn't help with the slowness of saving to the ini file format.
Those are four different methods, each with its advantages and disadvantages. Needless to say, for looking up individual data the hash table approach will be far faster on thousands of lines of data, whereas the window approach will generate top stats very fast.
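A minimal sketch of the hidden-window approach described above (the window name @scores, the alias names, and the "nick score" one-pair-per-line layout are my assumptions, not from the post):

```
; add or update a "nick score" line in a hidden window
alias score.add {
  if (!$window(@scores)) { window -h @scores }
  var %ln = $fline(@scores,$1 *,1)
  if (%ln) { rline @scores %ln $1 $2 }
  else { aline @scores $1 $2 }
}

; sort the window on the second (numeric) column, highest first,
; then echo the top ten lines
alias score.top10 {
  filter -wwcteu 2 32 @scores @scores
  var %i = 1
  while (%i <= 10) && ($line(@scores,%i) != $null) {
    echo -a $+(#,%i) $line(@scores,%i)
    inc %i
  }
}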
Gone.
Hoopy frood (OP) · Joined: Oct 2004 · Posts: 8,330
Hm... I hadn't thought of using a hidden window. I'll have to look into that method. That may work very well with my top scores. It may be that I'll use a combination of hash table and hidden window if that ends up being faster. Thanks.
Invision Support #Invision on irc.irchighway.net
Hoopy frood · Joined: Sep 2003 · Posts: 4,230
There is actually a clever little approach that can be taken, using /hsave and /filter. You save the hash table without item names at all, using the /hsave -n option. You then filter the file however you require, but also apply the /filter -n option: this adds line numbers, but the line numbers of the source file, not the result file. Using the resulting file, you can now back-reference into the hash table, using the line number as the N in $hget(name,N).item, and recover the item name. This works because /hsave -n saves the items in the same order as $hget(name,N).item/.data has them (likely their in-memory order?), i.e. $hget(name,1).data is on line 1 and so on. So when you sort them using /filter, as long as you use filter's -n option to carry along the line number each entry came from, you can look them back up. Example code below; make1000 just builds a 1000-item table with random scores of 1 to 500.

alias make1000 {
  hfree -sw hashtable
  var %i = 1000
  while (%i) {
    hadd -m hashtable $+(item,%i) $rand(1,500)
    dec %i
  }
}

alias top10 {
  hsave -no hashtable tempfile.txt
  filter -ffcteun 1 32 tempfile.txt tempfile.txt
  var %i = 1
  while (%i <= 10) {
    var %itemname = $hget(hashtable,$gettok($read(tempfile.txt,nt,%i),1,32)).item
    echo -a Top 10 $+(#,%i) %itemname with a score of $hget(hashtable,%itemname)
    inc %i
  }
}

Hope it helps.
Hoopy frood (OP) · Joined: Oct 2004 · Posts: 8,330
That is really clever! I think I'll end up using that method. Thanks. Btw, why is it that /hsave -i is so much slower? I mean, the only real difference in the output file between /hsave -i and a normal /hsave is line 1 (the section header), and that every other $crlf is replaced with =.
Invision Support #Invision on irc.irchighway.net
Hoopy frood · Joined: Feb 2004 · Posts: 2,019
Ha! /hsave -n saving the data in the same order as $hget(table,N).data is really a treat; I'll definitely put that to use in the future. gj
Gone.
Hoopy frood · Joined: Sep 2003 · Posts: 4,230
I would assume it's the internal mIRC calls used to save them; likely each line passes through the same internal calls a /writeini command would use. And if so, then it's recreating the file for each hash table entry, which would seem to gel with the way it slows down more and more the bigger the table gets.
Hoopy frood (OP) · Joined: Oct 2004 · Posts: 8,330
Hm... sounds like a feature request... change how it saves with -i.
Invision Support #Invision on irc.irchighway.net
Hoopy frood · Joined: Sep 2003 · Posts: 4,230
I wouldn't bother; ini files are, in my opinion, a legacy item of Windows. They're slow and can't support all data formats. In fact, that may be why they're slow to save: it might be a Windows call that's being made to save the data (speculation here).
Hoopy frood (OP) · Joined: Oct 2004 · Posts: 8,330
It's not so much the INI file format that's useful, as that the item and data are on the same line.
Invision Support #Invision on irc.irchighway.net
Hoopy frood · Joined: Sep 2003 · Posts: 4,230
Well, I already solved that for you using /hsave -n and /filter -n, but if you insist on wanting them in item=data order, use this:

alias example {
  var %file = tempfile.txt
  hsave -o hashtable %file
  var %size = $file(%file)
  bread %file 0 %size &binvar
  var %i = 0
  while (%i < %size) {
    var %i = $bfind(&binvar,%i,10)
    bset &binvar %i 61
    inc %i
    var %i = $bfind(&binvar,%i,10)
    inc %i
  }
  bwrite %file 0 -1 &binvar
  write -il1 %file [hashtable]
}

It simply saves in the generic itemname<newline>data<newline>itemname<newline>data<newline> order, then reads the file back into a binary variable and replaces every odd occurrence of a 10 (linefeed) with a 61 (=), and saves the file again, leaving you with itemname=data<newline>itemname=data<newline> etc. (Oh, and I added the ini header in case it's needed; I wasn't sure.) It's considerably faster.
Hoopy frood (OP) · Joined: Oct 2004 · Posts: 8,330
No no... I like your original method and I'm using that from now on. I was just commenting that it would be nice to have a simple /hsave -?? switch to get a format that /filter will easily handle without extra coding.
Invision Support #Invision on irc.irchighway.net
Hoopy frood · Joined: Sep 2003 · Posts: 4,230
I see what you mean. Well, that's not much code (the second one), and it would allow you to manipulate the stuff using /filter and reinsert/reference it at a later date.
The original way, while useful, has one drawback: the hsave -n -> filter -n must be done immediately prior to your access of the hash table, and you also have to take care how you access your values. Say you wanted to sort the data from highest to lowest and then delete the lowest 50% of the scores. You would need to:
* /hsave -n to get the data
* /filter -n to sort it smallest to largest and get the line numbers added
* delete the 2nd half of the file (the top 50%)
* filter the remaining file into line-number order, highest first
* loop through that file deleting the corresponding $hget(ht,N).item's
The items need to be deleted from the highest N down, because any deletions you start making affect the accuracy of the remaining line numbers matching their N values.
It's like if you wanted to delete lines 1, 3 and 5 of a file: you actually delete lines 1, then 2, then 3, because the line numbers change when you delete one.
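The steps above might be sketched roughly like this (the table name ht, the file name, and the alias name are made up for illustration; a sketch, not a tested implementation):

```
; hypothetical sketch: delete the lowest-scoring half of table "ht"
alias ht.prunelow {
  ; 1. dump data only: line N of the file matches $hget(ht,N)
  hsave -no ht scores.txt
  ; 2. numeric sort ascending on the score, with -n tagging each
  ;    line with its source line number (the N value)
  filter -ffctun 1 32 scores.txt scores.txt
  ; 3. collect the N values of the lowest-scoring half
  var %half = $int($calc($lines(scores.txt) / 2)), %i = 1, %list
  while (%i <= %half) {
    %list = %list $gettok($read(scores.txt,nt,%i),1,32)
    inc %i
  }
  ; 4. delete from the highest N down so the remaining
  ;    indexes stay valid while we delete
  %list = $sorttok(%list,32,nr)
  var %j = 1
  while (%j <= $numtok(%list,32)) {
    hdel ht $hget(ht,$gettok(%list,%j,32)).item
    inc %j
  }
}
```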
Hoopy frood (OP) · Joined: Oct 2004 · Posts: 8,330
Delete in reverse. That's actually how our trivia question deletion works; we delete from the bottom up. Anyhow, I'll keep both in mind. I don't plan on deleting anything, just getting the top listing, so I'll just leave the first one you gave in the script this time. Thanks.
Invision Support #Invision on irc.irchighway.net
Hoopy frood · Joined: Apr 2003 · Posts: 701
It's the other way around, actually. :tongue: If you wanted to delete lines 1,2,3 then just delete line 1 three times; if you want to delete lines 1,3,5 then delete lines 1,2,3; if you literally delete lines 1,3,5, then you have actually deleted lines 1,4,7.
Hoopy frood · Joined: Sep 2003 · Posts: 4,230
My original statement:
> like if you wanted to delete lines 1 3 5 of a file, you actually delete lines 1 then 2 then 3, because the line numbers change when you delete one.
> It's the other way around actually :tongue:
Ummm, no.
> If you wanted to delete lines 1,2,3 then just delete line 1 three times
Not relevant.
> if you want to delete lines 1,3,5 then delete lines 1,2,3
Just what I said!
> if you delete lines 1,3,5, then you have actually deleted lines 1,4,7
Yes, precisely what I was saying.
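To make the line shifting concrete, here is a tiny illustration (the @test window and its contents are hypothetical):

```
; @test holds five lines: a b c d e (lines 1-5)
; goal: remove the ORIGINAL lines 1, 3 and 5 (a, c, e)
dline @test 1  ; removes a; c slides up to line 2, e to line 4
dline @test 2  ; removes c; e slides up to line 3
dline @test 3  ; removes e; only b and d remain
```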
Hoopy frood · Joined: Apr 2003 · Posts: 701
Oh, well, I read it correctly, I just understood it wrong. I reasoned: "If you wanted to delete 1,3,5 (and thus delete those lines), you actually delete lines 1 then 2 then 3," where you meant "if you wanted to delete lines 1 3 5 of a file, you should actually delete lines 1 then 2 then 3." Oh well, misunderstandings happen; hopefully everyone understands now. (And if they don't, try stuff out.)