mIRC Home    About    Download    Register    News    Help

Topic Options
#135236 - 09/11/05 04:40 PM Sorting Hash Tables
Riamus2 Offline
Planetary brain

Registered: 13/10/04
Posts: 8327
Loc: MA, USA
Well, I'm working on a specialized "full" script for my channel and so I have yet another question. smile

I was planning on saving data in a hash file to track top 10 information on a large volume of data. The problem is that there isn't a command for sorting a hash table (we should have $hsort smile). And, if I save the table, it saves on separate lines rather than item=data or something similar on one line. So, I can't just easily sort it with /filter.

I know I could run a script to put the item and data on one line, but that's time consuming (CPU consuming) for large amounts of data. I could also check all the items and sort that way inside the hash table itself, but that is also time/CPU consuming for large amounts of data.

So, my question is... what is a quick method to sort data that isn't likely to lag mIRC or cause huge CPU spikes?

Think of it like:

Item: Nick
Data: Score
_________________________
Invision Support
#Invision on irc.irchighway.net

Top
#135237 - 09/11/05 05:53 PM Re: Sorting Hash Tables
FiberOPtics Offline
Hoopy frood

Registered: 05/02/04
Posts: 2019
Loc: Leuven, Belgium
For items beneath 5k, it could still be very feasible to loop through the hash table, writing the item=data part to a file with the file handling commands, and issuing /filter on the file. Or you can loop through them and aline them to a hidden window, speed will be very similar.

However, as your data will grow, which it surely will, looping through tens of thousands of entries and writing them to a file, merely for the purposes of sorting will become inadequate.

I would recommend using a hidden window, with: item data.

Sorting will be easy with filter and the t flag.

Looking up data from an entry will be equally easy with: $fline(@win,$1 *,1)

Obviously, not using hash tables also has its drawbacks, being the lookup of data for a specified item. Whereas with a hash table this uses a hashing algorithm, with a hidden window this will use the linear technique.

I haven't benched the $fline, though on thousands of lines, it's obvious $hget will be much faster than $fline.

Yet another way is to keep using hash tables, hsaving the table to a file, doing a filter on the second column with the t flag and delimiter 13. Then you have a sorted file with the data in sorted order, followed by the items. Then you will need to loop through your hash table with $hfind to find the nickname that corresponds with the data.

Another way is also keeping your hash table, but saving it as an ini file, and then doing a filter on it with t flag again. Problems with this approach:

* hsave -i gets terribly slow if you have 500 items or more.
* nicknames containing [ ] chars are transformed to ~ with no way of knowing what they originally represented. Imo this is the worst way.

However, you could $decode your nicknames, avoiding this [ ] -> ~ issue, but still, it doesn't help with the slowness of the saving to ini file format.

Those are four different methods, which all have their advantages and disadvantages. Needless to say, for looking up invididual data, the hash table approach will be far faster on thousands of lines of data, whereas the window approach will generate top stats very fast.
_________________________
Gone.

Top
#135238 - 09/11/05 06:02 PM Re: Sorting Hash Tables
Riamus2 Offline
Planetary brain

Registered: 13/10/04
Posts: 8327
Loc: MA, USA
Hm... I hadn't thought of using a hidden window. I'll have to look into that method. That may work very well with my top scores. It may be that I'll use a combination of hash table and hidden window if that ends up being faster. Thanks. smile
_________________________
Invision Support
#Invision on irc.irchighway.net

Top
#135239 - 10/11/05 05:02 AM Re: Sorting Hash Tables
DaveC Offline
Planetary brain

Registered: 26/09/03
Posts: 4230
There is actually a clever little approch that can be taken , using hsave and filter.

You save the hashtable with out itemnames at all using the /HSAVE -n option
You then filter the file however you requier but also apply the /FILTER -n option this adds line numbers, but the line numbers of the source file not the result file
Using this resulting file, you can now back refrence into the hashtable using the line number as the N in $hget(name, N).item and recover the itemname.

This occurs becuase the /hsave -n saves the items in the same order as the $hget(name,N).item/data has them, (likely there in memory order?) ie: $hget(name,1).data is on line 1 and so on, so when you sort them using /filter as long as you have filters -n option to pass the line number they came from, you can then look them back up.

Example code below
MAKE1000 is just to make a 1000 item table with random scores of 1 to 500
Code:
alias make1000 {
  hfree -sw hashtable
  var %i = 1000
  while (%i) {
    hadd -m hashtable $+(item,%i) $rand(1,500)
    dec %i
  }
}
alias top10 {
  hsave -no hashtable tempfile.txt
  filter -ffcteun 1 32 tempfile.txt tempfile.txt
  var %i = 1 | while (%i <= 10) {
    var %itemname = $hget(hashtable,$gettok($read(tempfile.txt,nt,%i),1,32)).item
    echo -a Top 10 $+(#,%i,) %itemname with a score of $hget(hashtable,%itemname)
    inc %i
  }
}


Hope it helps.

Top
#135240 - 10/11/05 02:31 PM Re: Sorting Hash Tables
Riamus2 Offline
Planetary brain

Registered: 13/10/04
Posts: 8327
Loc: MA, USA
That is really clever! I think I'll end up using that method. Thanks. laugh

Btw, why is it that /hsave -i is so much slower? I mean, the only real difference in the output file between /hsave -i and a normal hsave is line 1 (the section), and that every other $crlf is replaced with =.
_________________________
Invision Support
#Invision on irc.irchighway.net

Top
#135241 - 10/11/05 03:35 PM Re: Sorting Hash Tables
FiberOPtics Offline
Hoopy frood

Registered: 05/02/04
Posts: 2019
Loc: Leuven, Belgium
Ha!

The hsave -n saving the data in the same order as $hget(table,n).data is really a treat, I'll definitely put that to use in the future, gj smile
_________________________
Gone.

Top
#135242 - 10/11/05 09:41 PM Re: Sorting Hash Tables
DaveC Offline
Planetary brain

Registered: 26/09/03
Posts: 4230
I would assume its the internal mirc calls used to save them, likely each line passes through the same internal calls a writeini command would use.
And if so then its recreating the file for each hashtable entery, this would seem to gel with the way it slows down more and more the bigger the table.

Top
#135243 - 10/11/05 09:48 PM Re: Sorting Hash Tables
Riamus2 Offline
Planetary brain

Registered: 13/10/04
Posts: 8327
Loc: MA, USA
Hm... sounds like a feature request... change how it saves with -i. smile
_________________________
Invision Support
#Invision on irc.irchighway.net

Top
#135244 - 10/11/05 09:56 PM Re: Sorting Hash Tables
DaveC Offline
Planetary brain

Registered: 26/09/03
Posts: 4230
I wouldnt bother, ini files are in my opion legacy item of windows, there slow and cant support all data formats, infact that maybe why there slow to save in, it might be a windows call thats been made to save the data (speculation here)

Top
#135245 - 11/11/05 03:45 PM Re: Sorting Hash Tables
Riamus2 Offline
Planetary brain

Registered: 13/10/04
Posts: 8327
Loc: MA, USA
It's not so much the INI file format that's useful, as that the item and data are on the same line.
_________________________
Invision Support
#Invision on irc.irchighway.net

Top
#135246 - 11/11/05 04:21 PM Re: Sorting Hash Tables
DaveC Offline
Planetary brain

Registered: 26/09/03
Posts: 4230
well i already solved that for ya using /hsave -n and /filter -n

but if u insist on wanting them in item=data order use this

Code:
alias example {
  var %file = tempfile.txt
  hsave -o hashtable %file
  var %size = $file(%file)
  bread %file 0 %size &binvar
  var %i = 0
  while (%i < %size) {
    var %i = $bfind(&binvar,%i,10)
    bset &binvar %i 61
    inc %i
    var %i = $bfind(&binvar,%i,10)
    inc %i
  }
  bwrite %file 0 -1 &binvar
  write -il1 %file [hashtable]
}


It simply saves in the generic itemname<newline>data<newline>itemname<newline>data<newline> order then reads the file back in and replaces every odd occrance of a 10 (linefeed) with a 61 =, and saves the file again, leaving you with itemname=data<newline>itemname=data<newline> etc etc. (oh and i added the ini header if its needed i wasnt sure).

Its considerably faster smile

Top
#135247 - 11/11/05 04:26 PM Re: Sorting Hash Tables
Riamus2 Offline
Planetary brain

Registered: 13/10/04
Posts: 8327
Loc: MA, USA
No no... I like your original method and I'm using that from now on. I was just commenting that it would be nice for a simple /hsave -?? to be used to get a format that /filter will easily handle without extra coding. smile
_________________________
Invision Support
#Invision on irc.irchighway.net

Top
#135248 - 11/11/05 09:03 PM Re: Sorting Hash Tables
DaveC Offline
Planetary brain

Registered: 26/09/03
Posts: 4230
I see what you mean, well thats not much code (the second one) and it would allow you to manipulate the stuff using filter and reinstert/refrence it at a later date.

The orginal way, while usefull, has one draw back, the hsave -n -> filter -n must be done imediately proir to your access of the hashtable and you aslo have to take care how you access your values, say you were wanting to sort the data into highest to lowest and then delete the lowest 50% of the scores, well u need to
hsave -n to get the data
filter -n to sort into smallest to largest order and get the line numbers added
delete 2nd half of the file (the top 50%)
filter the file in to linenumber order highest first
loop through that file deleting said $hget(ht,N).item's

The items would need to be deleted from the highest number becuase any changes you start making to the file effect the accuaracy of the linenumbers matching to the N value

like if u wanted to delete lines 1 3 5 of a file, you actually delete lines 1 then 2 then 3, becuase the line numbers change when u delete one.

Top
#135249 - 11/11/05 09:13 PM Re: Sorting Hash Tables
Riamus2 Offline
Planetary brain

Registered: 13/10/04
Posts: 8327
Loc: MA, USA
Delete in reverse. smile

That's actually how we work out trivia question deleting setup. We delete from the bottom up.

Anyhow, I'll keep both in mind. I don't plan on deleting anything... just getting the top listing, so I'll just leave the first one you gave in the script this time.

Thanks.
_________________________
Invision Support
#Invision on irc.irchighway.net

Top
#135250 - 12/11/05 12:59 PM Re: Sorting Hash Tables
Kelder Offline
Hoopy frood

Registered: 12/04/03
Posts: 701
Loc: Leuven, Belgium
It's the other way around actually :tongue: If you wanted to delete lines 1,2,3 then just delete line1 3 times, if you want to delete lines 1,3,5 then delete lines 1,2,3, if you delete lines 1,3,5, then you have actually deleted lines 1,4,7 smile

Top
#135251 - 13/11/05 12:59 AM Re: Sorting Hash Tables
DaveC Offline
Planetary brain

Registered: 26/09/03
Posts: 4230
My original statment.
like if u wanted to delete lines 1 3 5 of a file, you actually delete lines 1 then 2 then 3, becuase the line numbers change when u delete one.

Quote:
It's the other way around actually :tongue:

ummm no

Quote:
If you wanted to delete lines 1,2,3 then just delete line1 3 times

Not relevant.

Quote:
if you want to delete lines 1,3,5 then delete lines 1,2,3

Just what i said!

Quote:
if you delete lines 1,3,5, then you have actually deleted lines 1,4,7 smile

Yes precisely what i was saying.

Top
#135252 - 13/11/05 10:20 AM Re: Sorting Hash Tables
Kelder Offline
Hoopy frood

Registered: 12/04/03
Posts: 701
Loc: Leuven, Belgium
Oh, well, I read it correctly, just understood it wrong crazy
I reasoned:
"If you wanted to delete 1,3,5 (and thus delete those lines), you actually delete lines 1 then 2 then 3"
where you mean
"like if u wanted to delete lines 1 3 5 of a file, you should actually delete lines 1 then 2 then 3"

Oh well, misunderstandings happen, hopefully everyone understands now. (And if they don't, try stuff out then)

Top