Yes - this is how hash tables work (hence the term "hash"). An explanation of how hash tables works follows - but see the solution at the end.

A hash table works by creating a number of slots (which you can specify). The item name is fed into a hashing algorithm to select which slot to put an item on. Each slot is then a linked list of items and your new item is inserted at the very beginning of the linked list of the slot that the hash algorithm selects.

When you want to search for the data by item name, mIRC uses the same algorithm to choose the slot and then iterates down the linked list to see if it can find the item you want.

When you do a positional $hget for the first item, mIRC gives you the first item in the linked list on the first slot. But since it is unlikely that the first item you added hashed to the first slot, you won't get that - instead you get the first item whose hash points to the first slot.

The default number of slots if you do not specify it explicitly is 100, which actually results in 101 slots because mIRC always makes the number of slots an odd number because of the way that the hashing algorithm works.

The surprise is that (for 101 slots) that none of the first 788 items hashed to the first slot when you might expect that to have happened a lot sooner.

Now, if you have followed my explanation you might have spotted two things:

1. Performance If you have too few slots for the number of items, then a look up by item name will pick the slot quite fast, but will then have to iterate down the linked list to find the item wanted which could be a lengthy CPU operation if the linked list is a long one.

In general terms, the more slots you have, the shorter the average length of the linked list.

(However you cannot guarantee that the items will be distributed evenly over the available slots. The worst case scenario is that by chance all the item names hash down to the same slot and you get a single very long linked list on that slot and empty linked lists on the other slots. But given a reasonably random set of item names, usually they will be reasonably evenly distributed across slots.)

I have not benchmarked the performance for a random lookup in a hash table with random item names depending on the slot size - but my guess is that the best performance will be when the number of slots is close to the number of items in the table. Khaled may have done some benchmarking when he suggests that you use a number of slots about 1/10 the number of items, but it seems to me that might mean having linked lists between (say) 5 and 20 long - whereas with the number of slots close to the number of items the linked lists should only be (say) 1 to 5 items long. Wikichip suggests that the optimum number of slots is 1.282x the number of entries.

Each slot only takes a very small number of bytes - far smaller than the bytes required to store the items themselves - so in most cases it seems to me that there is little downside to using a relatively large number of slots. (The maximum number of slots is 10,000 (i.e. actually 10,001) which uses 40K of memory - and that is not a huge amount in today's PCs, so I tend not to worry too much about excess slots.)

HOWEVER that applies only to lookup by item name. $hfind also allows you to search by item name using the item name as a wildcard match to text, or as wildcard text matching the item name, or using the item name as a regex to match to text or as regex text to match the item name. Since there seems to be no way to use a hash determine which slot the item might be on, it seems to me that it is likely that mIRC has to iterate across all linked lists on all slots to complete the find, in which case the performance is unlikely to be effected by the number of slots. Ditto when you do any of the above searches on the data rather than the item.

So for $hfind which iterates across all entries regardless of number of slots, you might as well use 1 slot.

But if you are using $hget to do a hash lookup based on a specific item, then you should use as large a number of slots as you think you need.

2. If you have digested all of the above then you may have realised that you can make a hash table store items sequentially so that you can iterate over them as the original poster would like to by using -m1 to create a single slot hash table and so a single linked list. Since each item is added to the beginning of the list, the sequence they are added is preserved (but reversed) when you iterate using $hget(table,n) in other words you can either store them in reverse order or access them in reverse order:

Code:
//hfree TEST | //var %i = 1 | while (%i <= 1000) { hadd -m1 TEST test_ $+ %i 1 | inc %i } | echo -a Item 1: $hget(TEST,1).item Item 1000: $hget(TEST,1000).item
//hfree TEST | //var %i = 1000 | while (%i) { hadd -m1 TEST test_ $+ %i 1 | dec %i } | echo -a Item 1: $hget(TEST,1).item Item 1000: $hget(TEST,1000).item

Last edited by Protopia; 13/10/18 08:18 PM.