
$sha* larger disk buffer speed boost #265754 12/07/19 10:29 PM
maroon (OP)
Joined: Jan 2004
Posts: 1,214
Hoopy frood
Quote

28.Changed $sha1/$sha256/$sha384/$sha512/$hmac() to use larger read buffer with files to improve speed.


I'm seeing a noticeable boost in speed for the listed hashes, but it wasn't clear whether $crc and $md5 were already considered to be optimized, or just not a priority.

I'm curious how large the SHA* disk buffer is now. With Saturn's sha2.dll, which had a 4kb disk buffer, I was finding a 10% speed gain from increasing that to 16kb, but the disk speeds here increased by more than that.

The attached alias is slow, because I used an 8gb file to try to keep the OS disk cache from favoring the 2nd disk read. The alias compares the speed of CRC vs MD5 when reading from disk, a string vs a binvar of the same 8kb length, and a longer binvar so that overhead is a smaller percentage of the time.

/compare_crc_md5 2

The '2' changes the order in which the huge disk file is read; otherwise it calculates CRC first. I had to REM the $hash benchmark because it was extremely slow compared to all the other hashes. Hashing an 8kb string takes approx 31 ticks, so the 100k repetitions would have taken nearly an hour.

Since CRC is a much simpler algorithm than MD5, I expected that CRC would be significantly faster. However, the time difference between MD5 and CRC when handling binvars or text was closer than I expected, and regardless of which order I run MD5 or CRC against the 8gb file, when reading from disk CRC is always slower than MD5, though by a small percentage.

Since the benchmark shows CRC slower from disk but very slightly faster from string/binvar, it looks like CRC is using a smaller disk buffer than MD5 does, and also that either CRC is not as optimized as it could be, or MD5 was coded in assembler while CRC was coded in a higher-level language.

I'm assuming the speed difference between hashing 8kb of text and an 8kb binvar is because the binvar doesn't need to keep re-allocating memory the way the text does.

I was expecting the much more complex MD5 to take more than twice the time of CRC, but the difference was only around 5% when comparing the times from hashing the long binvars, where there was much less overhead than in the other tests. From looking at mirc.exe in a hex viewer, I was able to find the 1024 bytes of the CRC lookup table, so the slowdown isn't from $crc using the slower no-lookup algorithm. (I did see two copies of the lookup table, so there might be some duplicate code that could be removed.)
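
For reference, the table-driven CRC-32 that the hex dump suggests can be sketched in Python. This is only an illustration of the standard algorithm, not mIRC's actual code; the 256 entries x 4 bytes each account for the 1024 bytes visible in the hex viewer:

```python
import zlib

# Build the standard 256-entry CRC-32 lookup table for the reflected
# polynomial 0xEDB88320. 256 entries x 4 bytes = 1024 bytes, matching
# what a table-driven implementation looks like in a hex viewer.
TABLE = []
for n in range(256):
    c = n
    for _ in range(8):
        c = (c >> 1) ^ 0xEDB88320 if c & 1 else c >> 1
    TABLE.append(c)

def crc32(data: bytes) -> int:
    # Table-driven loop: one table lookup per byte instead of eight
    # bit-by-bit steps, which is why the lookup variant is faster.
    crc = 0xFFFFFFFF
    for b in data:
        crc = (crc >> 8) ^ TABLE[(crc ^ b) & 0xFF]
    return crc ^ 0xFFFFFFFF
```

The result matches zlib's built-in crc32, which uses the same polynomial.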

SHA* hashes haven't made CRC and MD5 go away. CRC is used to verify the integrity of file transfers or to quickly compare 2 files of the same size, and MD5 is sometimes used as a faster-than-SHA1 way of doing the same, when the chance of a CRC collision is considered too high.

This is the benchmark where I found $hash to be about 300x slower than $crc for text strings. Unless someone needs compatibility with $hash, they should either use the crchash alias, or take up to 13 hex digits from one of the crypto hashes for up to 52 bits of a hash.

Code
//var %string $str(a,8192), %reps 1000, %t $ticks | while (%reps) { noop $crc(%string,0) | dec %reps } | echo -a ticks: $calc($ticks - %t)
//var %string $str(a,8192), %reps 1000, %t $ticks | while (%reps) { noop $hash(%string,32) | dec %reps } | echo -a ticks: $calc($ticks - %t)

alias crchash { return $calc( $base($crc($1,0),16,10) % (2^$iif($$2 isnum 1-32,$gettok($2,1,46),32)) ) }
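
The same two ideas can be sketched in Python (the function names crchash and sha_bits are hypothetical; the mIRC alias above is the authoritative version):

```python
import hashlib
import zlib

def crchash(text: str, bits: int = 32) -> int:
    # Reduce CRC-32 to at most `bits` bits, mirroring the crchash alias:
    # $calc($base($crc($1,0),16,10) % (2^bits))
    return zlib.crc32(text.encode()) % (1 << bits)

def sha_bits(text: str, bits: int = 52) -> int:
    # Take up to 13 hex digits (52 bits) from a crypto hash, as the post
    # suggests for a wider non-crypto hash.
    return int(hashlib.sha1(text.encode()).hexdigest()[:13], 16) % (1 << bits)
```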



This is the slow alias, where each of the CRC or MD5 disk reads takes over a minute. Be sure to delete the 8gb disk file when done.

Code
alias compare_crc_md5 {
  if ($disk($mircdir).free < $calc(1.1*2^33)) { echo -a insufficient free disk space | return }
  tokenize 32 $iif($1 == 2,md5 crc,crc md5)
  bset &string 1 0 | bwrite 8gig.dat $calc(2^33-1) 1 &string
  var -s %t $ticks , %a $ $+ $1 $+ (8gig.dat,2)
  echo -a disk $1 $eval(%a,2) ticks: $calc($ticks - %t)
  var -s %t $ticks , %a $ $+ $2 $+ (8gig.dat,2)
  echo -a disk $2 $eval(%a,2) ticks: $calc($ticks - %t)
  var -s %len 8192 , %reps 100000
  bset &string %len 0 | var %string $str(a,%len) | bset &longstring 99999999 0
  var %i %reps, %t $ticks | while (%i) { noop  $crc(%string,0) | dec %i } | echo -a crc. text $calc($ticks - %t)
  var %i %reps, %t $ticks | while (%i) { noop  $md5(%string,0) | dec %i } | echo -a md5. text $calc($ticks - %t)
  var %i %reps, %t $ticks | while (%i) { noop $sha1(%string,0) | dec %i } | echo -a sha1 text $calc($ticks - %t)
  ; var %i %reps, %t $ticks | while (%i) { noop $hash(%string,32) | dec %i } | echo -a hash text $calc($ticks - %t)
  var %i %reps, %t $ticks | while (%i) { noop  $crc(&string,1) | dec %i } | echo -a crc. bvar $calc($ticks - %t)
  var %i %reps, %t $ticks | while (%i) { noop  $md5(&string,1) | dec %i } | echo -a md5. bvar $calc($ticks - %t)
  var %i %reps, %t $ticks | while (%i) { noop $sha1(&string,1) | dec %i } | echo -a sha1 bvar $calc($ticks - %t)
  var -s %reps 10
  var %i %reps, %t $ticks | while (%i) { noop  $crc(&longstring,1) | dec %i } | echo -a crc. bvar $calc($ticks - %t)
  var %i %reps, %t $ticks | while (%i) { noop  $md5(&longstring,1) | dec %i } | echo -a md5. bvar $calc($ticks - %t)
  var %i %reps, %t $ticks | while (%i) { noop $sha1(&longstring,1) | dec %i } | echo -a sha1 bvar $calc($ticks - %t)
}

Re: $sha* larger disk buffer speed boost [Re: maroon] #265755 13/07/19 07:26 AM
Khaled
Joined: Dec 2002
Posts: 4,583
Hoopy frood
Thanks for testing these out.

Quote
I'm seeing a noticeable boost in speed for the listed hashes, but it wasn't clear whether $crc and $md5 were already considered to be optimized, or just not a priority.

$md5() was already using the larger buffer. $crc() is actually a much older implementation located elsewhere in the source code which I didn't look at. I have increased its buffer size as well in the next beta.

Re: $sha* larger disk buffer speed boost [Re: Khaled] #265792 24/07/19 06:28 AM
maroon (OP)
Joined: Jan 2004
Posts: 1,214
Hoopy frood
I'm not sure why $crc should be so close to $md5's speed, because everything I've read suggests CRC32 should be 3-5x faster than MD5. Since I find 2 copies of the crc-32 lookup table bytes in a hex viewer, it shouldn't be caused by $crc using the slower method that skips the lookup table. The only other reason I can think of for $crc being slower would be a routine that's called once per byte instead of once for a long string.
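
That per-byte-call suspicion is easy to illustrate outside mIRC. This Python sketch (not mIRC's source) computes the identical CRC two ways: re-entering the routine once per byte, versus one call over the whole buffer, where only the call overhead differs:

```python
import zlib

def crc_per_byte(data: bytes) -> int:
    # Hypothetical "slow" structure: re-enter the CRC routine once per
    # byte, paying function-call overhead for every byte of input.
    crc = 0
    for i in range(len(data)):
        crc = zlib.crc32(data[i:i + 1], crc)
    return crc

def crc_one_call(data: bytes) -> int:
    # One call over the whole buffer: identical result, far less overhead.
    return zlib.crc32(data)
```

Both return the same checksum; on large inputs the per-byte version is dominated by overhead rather than CRC math.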

It seems the benchmark comparisons between the hashes' speeds depend on which pc is running them. The 1st set of results is from my PC, and the 2nd is from a slower pc running win10.

crc: 1263 binvar size: 536870912
md5: 1326 is 104.988124%
sha1: 2449 is 193.903405%
sha256: 6911 is 547.189232%
sha384: 11217 is 888.123515%
sha512: 10982 is 869.517023%

crc: 1547 binvar size: 536870912
md5: 1766 is 114.156432%
sha1: 2203 is 142.404654%
sha256: 4109 is 265.61086%
sha384: 3469 is 224.240465%
sha512: 3484 is 225.210084%

The strange thing is how the SHA* hashes all got faster on the slower pc. SHA512 is supposed to be slower than SHA256, but the win10 pc was actually running it faster.

This alias tries to eliminate as many types of overhead as possible, beyond just hashing the data. I had it hash the binvar once before timing it, in case there was a time delay in allocating the binvar in the first place. By using a very long binvar, I hoped to avoid time delays caused by repeatedly allocating string space, or lag from reading files from disk.

Code
alias hash_speeds {
  bset &v $calc(2^$iif($1 isnum 1-,$1,29)) 1 | noop $crc(&v,1)
  var %t $ticks | noop    $crc(&v,1) | var  %crc   $calc($ticks - %t) | echo -a    crc:  %crc binvar size: $bvar(&v,0)
  var %t $ticks | noop    $md5(&v,1) | var  %md5   $calc($ticks - %t) | echo -a    md5:  %md5   is $calc(100 * %md5    /%crc) $+ %
  var %t $ticks | noop   $sha1(&v,1) | var %sha1   $calc($ticks - %t) | echo -a   sha1: %sha1   is $calc(100 * %sha1   /%crc) $+ %
  var %t $ticks | noop $sha256(&v,1) | var %sha256 $calc($ticks - %t) | echo -a sha256: %sha256 is $calc(100 * %sha256 /%crc) $+ %
  var %t $ticks | noop $sha384(&v,1) | var %sha384 $calc($ticks - %t) | echo -a sha384: %sha384 is $calc(100 * %sha384 /%crc) $+ %
  var %t $ticks | noop $sha512(&v,1) | var %sha512 $calc($ticks - %t) | echo -a sha512: %sha512 is $calc(100 * %sha512 /%crc) $+ %
}

Re: $sha* larger disk buffer speed boost [Re: maroon] #265793 24/07/19 07:47 AM
Khaled
Joined: Dec 2002
Posts: 4,583
Hoopy frood
It all comes down to the implementation used and how optimized it is. In some cases mIRC is using an internal implementation; in other cases it is using a Windows API or an external library. As for SHA384 and SHA512, these both use the same Windows API, so it comes down to how Windows has implemented/optimized them. It may even be related to 32bit vs 64bit, where one implementation takes advantage of 64bit word sizes. That said, I don't have plans to look into these further, as they are not a priority. Thanks for testing them out though.

Re: $sha* larger disk buffer speed boost [Re: Khaled] #265795 24/07/19 09:16 AM
Raccoon
Joined: Feb 2003
Posts: 2,668
Hoopy frood
Hey Khaled,

What's going on behind the scenes to produce these benchmark results that seem so counter-intuitive?

Each benchmark performs 100,000 $crc(Raccoon,0) calls in total. The [ [ $str(...,N) ] ] constructs are pre-evaluated before the while-loop begins.

Code
Codepasta:
//var %n = 100000, %t = $ticks | while (%n) { noop $crc(Raccoon,0) | dec %n } | echo -ai * Bench 100,000 ....... $calc($ticks - %t) ms.
//var %n = 10000, %t = $ticks | while (%n) { noop [ [ $str($!crc(Raccoon,0) $+ $chr(32),10) ] ] | dec %n } | echo -ai * Bench 10,000 x 10 ... $calc($ticks - %t) ms.
//var %n = 5000, %t = $ticks | while (%n) { noop [ [ $str($!crc(Raccoon,0) $+ $chr(32),20) ] ] | dec %n } | echo -ai * Bench 5,000 x 20 .... $calc($ticks - %t) ms.
//var %n = 2500, %t = $ticks | while (%n) { noop [ [ $str($!crc(Raccoon,0) $+ $chr(32),40) ] ] | dec %n } | echo -ai * Bench 2,500 x 40 .... $calc($ticks - %t) ms.
//var %n = 1000, %t = $ticks | while (%n) { noop [ [ $str($!crc(Raccoon,0) $+ $chr(32),100) ] ] | dec %n } | echo -ai * Bench 1000 x 100 .... $calc($ticks - %t) ms.
//var %n = 500, %t = $ticks | while (%n) { noop [ [ $str($!crc(Raccoon,0) $+ $chr(32),200) ] ] | dec %n } | echo -ai * Bench 500 x 200 ..... $calc($ticks - %t) ms.
//var %n = 250, %t = $ticks | while (%n) { noop [ [ $str($!crc(Raccoon,0) $+ $chr(32),400) ] ] | dec %n } | echo -ai * Bench 250 x 400 ..... $calc($ticks - %t) ms.
//var %n = 200, %t = $ticks | while (%n) { noop [ [ $str($!crc(Raccoon,0) $+ $chr(32),500) ] ] | dec %n } | echo -ai * Bench 200 x 500 ..... $calc($ticks - %t) ms.
//var %n = 100, %t = $ticks | while (%n) { noop [ [ $str($!crc(Raccoon,0) $+ $chr(32),1000) ] ] | dec %n } | echo -ai * Bench 100 x 1000 XXX $calc($ticks - %t) ms.

Results:
* Bench 100,000 ....... 3245 ms.  (While loop takes the longest...)
* Bench 10,000 x 10 ... 1311 ms.
* Bench 5,000 x 20 .... 1092 ms.
* Bench 2,500 x 40 .... 983 ms.
* Bench 1000 x 100 .... 967 ms. <---
* Bench 500 x 200 ..... 1030 ms.  (Now long lines take longer???)
* Bench 250 x 400 ..... 1201 ms.
* Bench 200 x 500 ..... 1280 ms.
* Line too long: $str (expected)


Just curious. Side topic but related.

Last edited by Raccoon; 24/07/19 09:21 AM.
