Thanks for the reply, Protopia. A few points about your post and Khaled's.

1. Some readers may be confused because you use "slot" to mean what /help calls "buckets", just as other references may use 'key' where /help uses 'item'.

2. Yes, you can place items in forward order by /hadd'ing them in reverse order, which is what my example showed. But that only holds for the initial creation. As my examples above showed, each /hsave followed by /hload flips the item order. So you either need to do the /hsave + /hload twice, or use a temp table to un-flip the order.
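The "do it twice" workaround could be wrapped in an alias along these lines. This is only a sketch: the alias name /hreload, its parameters, and the scratch file hreload.tmp in $mircdir are made up for illustration, and it assumes the flip behaves as described above and that the table already exists so its size can be reused.

```mirc
; Hypothetical alias: /hreload <table> <filename>
; Loads <filename> into <table>, then round-trips the table through a
; scratch file so the item order is flipped a second time.
alias hreload {
  var %t = $1, %f = $qt($2-), %tmp = $qt($+($mircdir,hreload.tmp))
  var %size = $hget(%t).size
  ; the table must already exist so its /hmake size can be reused
  if (%size == $null) { echo -a hreload: table %t must already exist | return }
  ; pass 1: load the real file, order comes back flipped relative to the file
  hfree %t | hmake %t %size | hload %t %f
  ; pass 2: save the flipped table to the scratch file and load that
  hsave -o %t %tmp
  hfree %t | hmake %t %size | hload %t %tmp
  .remove %tmp
}
```

Using a scratch file for the second pass leaves the original file untouched, so the file on disk keeps the order it was saved in.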

3. Khaled posted:


"Items in hash tables are stored in a completely random order."

They are shuffled into buckets in a way that appears random, especially if the item names are not similar to each other. However, they're not truly random, because the same combination of item names and bucket count always puts them in the same sequence.

//hfree -sw test | hmake -s test 100 | var %i 1 , %a | while (%i isnum 1-999) { hadd test item $+ $base(%i,10,10,3) data | inc %i } | var %n 1 | while ($hget(test,%n).item) { var %a $sha1(%a $v1) | inc %n } | echo -a hash of item sequence %a

This builds a sha1 digest from the sequence in which the 999 items are distributed into buckets/linked-chains. Repeating the same command always produces the identical digest, so the order is deterministic.

The same command also shows that buckets=even_number and buckets=even_number+1 always produce the same order. I'm sure that's because it's best for the number of buckets and the number of items to be relatively prime to each other, and an odd bucket count is deemed 'close enough'. The default 100 behaves like N=101 buckets, and 101 is prime. The max 10000 behaves like 10001, which is not prime (73 x 137); the largest prime <= 10001 is 9973. There's also precedent for the order changing between versions, as the order in 6.35 is different from what it currently is.
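To check the even vs. even+1 claim without comparing digests by eye, the one-liner can be turned into a custom identifier. This is a sketch: $seqhash is a made-up name, and it reuses (and frees) the table name "test" from the command above.

```mirc
; Hypothetical identifier: $seqhash(N) returns a sha1 digest of the item
; sequence produced by 999 items in a table created with /hmake test N
alias seqhash {
  if ($hget(test)) hfree test
  hmake test $1
  var %i = 1, %a
  while (%i <= 999) { hadd test item $+ $base(%i,10,10,3) data | inc %i }
  var %n = 1
  ; fold each item name into the digest in the order $hget() returns them
  while ($hget(test,%n).item) { var %a = $sha1(%a $v1) | inc %n }
  hfree test
  return %a
}
```

Usage, comparing any even bucket count with the next odd one:

//echo -a $iif($seqhash(100) == $seqhash(101),same order,different order)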

The allocation of items to buckets doesn't appear to use a sophisticated algorithm, or even something simple like $crc(). Some combinations of item count and bucket count leave the items barely shuffled at all, which you wouldn't expect from something that's trying to pseudo-randomly assign items to buckets. For example, for most of this list it simply swaps an even item with even+1:

//clear | hfree -sw test | hmake -s test 101 | var %i 20 | while (%i) { hadd test item $+ %i data | dec %i } | var %N 1 | while ($hget(test,%N).item) { echo 4 -a $ord(%N) item is $hget(test,%N).item | inc %N }

Based on the name of the identifier, I'm wondering if $hash() is used when assigning a bucket to each item, since it's very prone to producing identical 24-bit hashes for similar strings.


"No switch can be added to make them save in any particular order."

A switch for /hload that begins with the last pair of lines in the file and works its way back to the first would simulate such a switch, though I suppose there might be a performance hit for doing so. Otherwise, people would be forced to script their own fake /hload alias that uses $read() and /hadd to load the disk file in the same order in which the items were /hsave'ed.
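A fake /hload of that kind might look something like the sketch below. The alias name /hload_fwd and its parameters are made up. It assumes the default /hsave text format (item name on one line, its data on the next), a table that already exists, and that /hadd'ing in reverse order yields forward order as in point 2.

```mirc
; Hypothetical alias: /hload_fwd <table> <filename>
alias hload_fwd {
  var %t = $1, %f = $qt($2-), %n = $lines(%f)
  ; walk the file from the last item/data pair back to the first
  while (%n > 1) {
    ; n = don't evaluate the line, t = treat line 1 as plain text
    hadd %t $read(%f,nt,$calc(%n - 1)) $read(%f,nt,%n)
    dec %n 2
  }
}
```

Each $read() call opens the file again, so this will be noticeably slower than /hload on large files, and it does not handle files saved with -b or -n.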