The factor that affects performance the most is the position of a variable in Variables. Referencing a variable that happens to be 10th in the list is very fast, but that isn't the case when the variable is the 3000th in the list. The same is true for setting a variable: if Variables has a couple of thousand items, changing the value of the 10th variable is fast, but doing the same to the 3000th or, even worse, setting a new variable (which is then appended to the end of the list) is much slower. None of this is true for hash tables: the number of items in a table affects neither /hadd nor $hget() (but it does affect $hfind() and $hget().item, as starbucks_mafia hinted in a previous post).
I have benchmarked all of this, but I didn't write separate code for each case I examined; I just modified the code before each measurement. So I can't post a full benchmark that proves it (too lazy to do that), but I can post the 4 basic aliases I used and point out which parts I changed. By passing the appropriate numbers to them, one can test vars vs hash tables for any number of items.
alias varset {
  ; start clean, then create $1 global variables, each holding $2 characters
  unset %test_*
  var %i = $$1, %a = $str(b,$$2), %t = $ticks
  while %i { set %test_ $+ [color:red]%i[/color] %a | dec %i }
  echo -a VarSet $1-: $calc($ticks - %t)
}

alias varget {
  ; read $1 global variables, counting down from %test_$1 to %test_1
  var %i = $$1, %t = $ticks
  while %i { !.echo -q %test_ [ $+ [ [color:red]%i[/color] ] ] | dec %i }
  echo -a VarGet $1-: $calc($ticks - %t)
}

alias hshset {
  ; recreate the table, then add $1 items, each holding $2 characters
  hfree -w test | hmake test
  var %i = $$1, %a = $str(a,$$2), %t = $ticks
  while %i { hadd test _ $+ %i %a | dec %i }
  echo -a HshSet $1-: $calc($ticks - %t)
}

alias hshget {
  ; read $1 items back from the table
  var %i = $$1, %t = $ticks
  while %i { !.echo -q $hget(test,_ $+ %i) | dec %i }
  echo -a HshGet $1-: $calc($ticks - %t)
}
The syntax is:
/varset <iterations> <length of data>
/varget <iterations>
/hshset <iterations> <length of data>
/hshget <iterations>
<length of data> was included in case it makes any significant difference; it doesn't appear to.
I ran these aliases quite a few times with different arguments, changing the red parts (%i) to a specific number each time. After running
/varset 3000 10
one can easily notice that changing "%i" to "3000" makes any subsequent /varget and /varset calls fast, but changing it to "1" slows them down a lot. That's because /varset counts down, so %test_3000 is created first and ends up at the top of the vars list, while %test_1 is created last and ends up at the bottom (assuming you don't have other vars in Variables).
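If you'd rather not edit the aliases by hand for each measurement, something like the following should do the same job for the read test. It is only a sketch: the alias name vargetn and its second argument are my own additions for illustration, not one of the 4 aliases I actually used, and it assumes /varset has already been run so that the %test_* vars exist.

```mirc
; Sketch only: /vargetn and its second argument are illustrative additions.
; Run /varset first so that the %test_N variables exist.
alias vargetn {
  ; $1 = iterations, $2 = number of the %test_N variable to read on every pass
  var %i = $$1, %n = $$2, %t = $ticks
  while %i { !.echo -q %test_ [ $+ [ %n ] ] | dec %i }
  echo -a VarGetN $1-: $calc($ticks - %t)
}
```

For example, after /varset 3000 10, compare
/vargetn 10000 3000
/vargetn 10000 1
The first reads the var at the top of the list each time, the second the var at the bottom.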
The conclusion is more or less the same as yours:
- the more items you're going to use, the better hash tables are for the job
- this difference becomes significant with >1000 items (or perhaps >500 on slower systems)