mIRC Home    About    Download    Register    News    Help

Print Thread
#225266 29/08/10 06:11 PM
Joined: Feb 2003
Posts: 3,412
S
sparta Offline OP
Hoopy frood
OP Offline
Hoopy frood
S
Joined: Feb 2003
Posts: 3,412
I making a seen script, and i noticed that the hash table cosing me som problems with my computer, i get a strange lag when the table grow to 900+ users, and i was thinking of rewrite the script so it using a ini file and storing the users there instead, i know i need to use $readini, and my question is if that would be slower then grab the info from a hash table?

Joined: Aug 2004
Posts: 7,168
R
Hoopy frood
Offline
Hoopy frood
R
Joined: Aug 2004
Posts: 7,168
Any kind of file access will be slower than using the hash table.
However, there are ways of limiting the amount of file access you need to use..
Hash tables can take data directly from a section in an ini file.
See the -i switch under /help /hload and /help /hsave

Also mIRC has file handling routines. /help File Handling

These routines only open & close the file access when the command is given, unlike /write, /writeini, $read, $readini, $ini (and probably others I'm not thinking of) which automatically open & close the file access each time they are executed.
The opening & closing is what takes the most time.

An alternative to moving to an ini file, is to increase the initial size of your hash table. If you are experiencing lag at 900 users, then I suspect that you initialize your table at 100.
Expanding that to 1000 will allow you to work with a much larger user base, but also will not cause a major increase in the amount of RAM required.

The details as to the amount of RAM required will be dependent on your specific system setup.

Last edited by RusselB; 29/08/10 07:50 PM.
Joined: Aug 2010
Posts: 134
T
Vogon poet
Offline
Vogon poet
T
Joined: Aug 2010
Posts: 134
Originally Posted By: RusselB
An alternative to moving to an ini file, is to increase the initial size of your hash table. If you are experiencing lag at 900 users, then I suspect that you initialize your table at 100.
Expanding that to 1000 will allow you to work with a much larger user base, but also will not cause a major increase in the amount of RAM required.


Quote:
/hmake -s <name> <N>
Creates a new hash table with N slots.

A hash table can store an unlimited number of items regardless of the N you choose, however the bigger N is, the faster it will work, depending on the number of items stored.

eg. if you expect that you will be storing 1000 items in the table, a table of N set to 100 is quite sufficient.


You should be using just over the square root of the number of entries you expect using.

Joined: Feb 2006
Posts: 523
J
Fjord artisan
Offline
Fjord artisan
J
Joined: Feb 2006
Posts: 523
Originally Posted By: Thels

You should be using just over the square root of the number of entries you expect using.


this is categorically false in general :P

if we consider a uniform distribution of hashed item values (after all, we are speaking in general terms), it makes little sense to use anything other than a linear function of the expected number of items as your number of slots. otherwise the overall performance of your mIRC hash table is related to the size of the data set and the actual utility of the hash table (its ability to reference data values based on item names in constant time) diminishes. in particular, using your square root hypothesis, a hash table with 1,000,000 items is expected to perform 100 times worse than a hash table with 100 items.


"The only excuse for making a useless script is that one admires it intensely" - Oscar Wilde
Joined: Aug 2010
Posts: 134
T
Vogon poet
Offline
Vogon poet
T
Joined: Aug 2010
Posts: 134
Hmm, I heard that was the smartest thing to do a while ago.

I just did some reading up on it, and it seems I was misinformed. My apologies.

I'm a little clueless, though. It seems that the larger the table size, the faster code will perform? Then what exactly is the disadvantage of making huge hash tables?

Like, if I expect between 1000 and 2000 entries, would it be wise to make the table size 2000?

Joined: Feb 2006
Posts: 523
J
Fjord artisan
Offline
Fjord artisan
J
Joined: Feb 2006
Posts: 523
Originally Posted By: Thels
I'm a little clueless, though. It seems that the larger the table size, the faster code will perform? Then what exactly is the disadvantage of making huge hash tables?

Like, if I expect between 1000 and 2000 entries, would it be wise to make the table size 2000?


as always, there is a trade off between running time and memory consumption. for 2,000 entries and the recommended table size of 200, we see an average of 10 entries stored in every slot (the slot containing a linked list of sorts). it is not considered all that troublesome to have to traverse a list of this size (or, on average, half this size) to reference an item: other parts of your script might be orders of magnitude slower. this is why the '10 times less' rule is given by the help file. of course, if speed is of paramount importance to you and you have a reasonable understanding of what the table size represents, you can adjust it accordingly.


"The only excuse for making a useless script is that one admires it intensely" - Oscar Wilde
Joined: Oct 2003
Posts: 3,641
A
Hoopy frood
Offline
Hoopy frood
A
Joined: Oct 2003
Posts: 3,641
Also realize that for data sets of ~5000 and under (typical mIRC scale) there is basically no real tradeoff to be made on memory. The difference between choosing N=500 and N=5000 is about at most an extra 50kb of RAM.

So really, there's no reason to use the '10 times less' rule on small data sets. The rule is meant to balance the tradeoff between memory and CPU cycles; but where memory is negligible (like 50kb on a 1gb system), there's no reason to make this trade. A hash table with 5000 items will ALWAYS perform better than one with 500, assuming you have at least 500 items in the table.

Joined: Jan 2009
Posts: 116
K
Vogon poet
Offline
Vogon poet
K
Joined: Jan 2009
Posts: 116
I'm wondering how this works with tables with over 10000 (or was it 20000) items then, which is the largest size you can specify in mIRC... or tables that don't have any size specified at all, for that matter, because I have a table with >88000 items in it

Joined: Oct 2004
Posts: 8,061
R
Hoopy frood
Offline
Hoopy frood
R
Joined: Oct 2004
Posts: 8,061
Just as a note, you had the right idea, Thels... just the wrong wording. It's 10x less (or 10% of the number of items), not a square root.

As explained, if you match the items to the size, it will perform better, but will require more RAM. Depending on your system and what you're doing, that increase it RAM could actually make it perform worse. When you get right down to it, setting the size to 10% of the number of items won't be noticeably slower than setting it to the same size as the number of items, but will use less RAM. That's why it's considered a good standard to use.

If you're distributing the script, I personally feel that setting it to 10% is better because you don't know what the other systems can handle and no one will notice any performance difference as long as they aren't benchmarking it (the difference will be in milliseconds). If you're going to only use it for yourself and you have a fast system with a lot of RAM and you don't mind having the hash table use more memory, then go ahead and set the size to equal the number of items or even make it larger than the number of items if you want. Or just always do 10% and call it good. Like I said, you won't notice a difference in performance anyhow and it gets you in the habit of that for when you do decide to distribute a script to others.

@sparta: If you have lag, then either you have the table set to size 1 or something like that or else you have the script set up poorly. Hash tables at that size aren't going to be slow if set up well.


EDIT: Just one last note... the one thing you don't want to do is to set the size of the table to less than 10% of the total number of items. It will get exponentially slower by doing so. Setting it at 1000 when there are 10000 items isn't going to be a noticeable, but setting it to 10 with 10000 items will be noticeable.

Joined: Aug 2010
Posts: 134
T
Vogon poet
Offline
Vogon poet
T
Joined: Aug 2010
Posts: 134
Didn't N default to 100?

Regardless, IRC is running on an 8gig system, so I don't specifically feel the need to be memory conservative :P

EDIT: Good point about handing it out... I've not specified the size so far, as I was going to check later how many options I would be storing, but I'll go with a little over 10% with that.

Last edited by Thels; 30/08/10 01:11 PM.

Link Copied to Clipboard