Joined: Jan 2004
Posts: 2,127
Hoopy frood
OP
The salt *parameter* is a text string, yes. Rather than having parameters be &binvars, my list of improvements had previously suggested a switch that makes the key and salt/iv parameters be read as hex instead of UTF-8 text. I was not asking here for it to accept a &binvar parameter. I'm referring to how the 64-bit salt value seen in the string's header at byte positions 9-16 should be handled internally as that same 64-bit value.
Using the 's' switch just overrides how the 64-bit salt is randomly created, and using a salt parameter shorter than 8 bytes is no different than a random salt that happened to generate 0x00 for its final byte(s). $decode can always decrypt without using the 's' switch because the 64-bit salt created from the 's' salt parameter is stored in the header, and $decode has no way of knowing whether that salt was generated randomly or by user input. If 1 or more of the final bytes of the 64-bit salt is the 0x00 byte, it should still be part of the 8-byte 64-bit value, the same as the 0x00 would be if it were randomly generated.
The random salt has always been created as a binary value, where each of the 8 bytes has a 1/256 chance of being a 0x00 byte. 1 out of every 32.4 random 64-bit values has at least one 0x00 byte in it, and those are being hashed differently than other programs would expect. As I've seen described at the link I pasted, the method of combining the salt and password involves a hash function where the password can be of variable length, but the salt is a fixed length of 8. Not including the trailing 0x00's padded onto the end of user-input salts creates incompatible hashes, because a salt whose length is not 8 is being combined with the passphrase as input to the hash function which generates the IV and the salted-key.
In addition to being incompatible, this behavior of chopping at the 1st 0x00 byte is happening for the default generating of random 64-bit salts, and there it's causing many messages with unique 64bit salts and the same key parameter to be encrypted identically, only differing by having the 'unique' salt stored in the encrypted string's header. In addition to being incompatible with the test vectors, this is contrary to the intent of what a random salt should be doing.
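To make the effect concrete, here is a minimal Python sketch rather than mIRC's actual code. It assumes the key derivation is the OpenSSL-style `EVP_BytesToKey` construction with MD5 and a single round, producing a 16-byte key and an 8-byte IV; the thread only confirms that MD5 is involved, so the sizes, the function names, and the example salts are all assumptions made for illustration.

```python
import hashlib

def derive_key_iv(passphrase: bytes, salt: bytes, key_len: int = 16, iv_len: int = 8):
    """Assumed OpenSSL-style derivation: chained MD5 over passphrase + salt."""
    material = b""
    block = b""
    while len(material) < key_len + iv_len:
        # Each round hashes the previous digest, then the passphrase, then the salt.
        block = hashlib.md5(block + passphrase + salt).digest()
        material += block
    return material[:key_len], material[key_len:key_len + iv_len]

def truncate_at_nul(salt: bytes) -> bytes:
    """The behavior described above: everything from the first 0x00 on is dropped."""
    return salt.split(b"\x00", 1)[0]

passphrase = b"key"
salt_a = bytes.fromhex("00a1b2c3d4e5f607")  # hypothetical salt, first byte 0x00
salt_b = bytes.fromhex("0011223344556677")  # a different salt, same first byte

# Truncated: both salts collapse to the empty string, so key and IV are identical.
collide = derive_key_iv(passphrase, truncate_at_nul(salt_a)) == \
          derive_key_iv(passphrase, truncate_at_nul(salt_b))

# Full 8 bytes: the two salts give different key/IV pairs.
distinct = derive_key_iv(passphrase, salt_a) != derive_key_iv(passphrase, salt_b)

print(collide, distinct)
```

Under these assumptions, a program that hashes all 8 salt bytes arrives at a different key and IV than one that truncates, which is the incompatibility described above.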
By having a 64-bit salt combined with the key, it's supposed to allow somewhere in the neighborhood of 2^64 different salted-keys being generated from the same passphrase key+salt. 1 out of every 256 random salts has the 1st byte of the 64bit salt randomly generated as the 0x00 byte. By truncating the random salt at the first occurrence of the 0x00 byte, this group of random salts has 2^56 members in it, and they're all hashed by combining the passphrase with the $null string instead of combining the passphrase with the 8-byte value shown in the header as bytes 9-16. The same thing is happening to other groups of random salts which have 2^48 members, 2^40 members, etc. Instead of the 'birthday paradox' causing a random 64bit salt to have a 50% chance of being duplicated in 4 billion messages, the example showed truncated duplicates were happening over 100 times in 25k messages.
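The figures quoted above can be checked with a few lines of arithmetic. This is a sketch that assumes each salt byte is uniformly random and independent; the 25,000 message count is the one from the example in this thread.

```python
# Probability that at least one of 8 uniformly random bytes is 0x00.
p_any_zero = 1 - (255 / 256) ** 8
one_in = 1 / p_any_zero

# Salts whose first byte is 0x00 all truncate to the empty string.
group_size = 256 ** 7            # number of such salts, equal to 2**56
messages = 25_000
expected_empty_salt = messages / 256

print(f"{p_any_zero:.2%} of salts contain 0x00 (about 1 in {one_in:.1f})")
print(f"expected messages hashed with an empty salt: {expected_empty_salt:.1f}")
```

With truncation, every message in the first-byte-zero group is salted identically, which is why duplicates turn up after tens of thousands of messages rather than billions.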
At first glance, a lot of these strings appear different, but that's only because of containing a unique salt in their header, which sometimes is mostly ignored. In this example, the "+++++++++" is where most of the truncated 64bit salt is stored in the header. The +'s can be replaced with any other mime character and it has no effect on the decryption.
//echo -a $decode(U2FsdGVkX18A+++++++++0rIX3dSCYYa216ecXj5pkL9ki5Fa+iJR2jmd2mPUIjP,mc,key)
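The header layout can be inspected outside mIRC by base64-decoding the string. The sketch below uses only Python's standard library; the function name is hypothetical, and it decodes the example string as well as a variant where the nine +'s are swapped for A's.

```python
import base64

def split_salted_header(mime: str):
    """Split a decoded string into the 8-byte magic, 8-byte salt, and ciphertext."""
    raw = base64.b64decode(mime)
    return raw[:8], raw[8:16], raw[16:]

original = "U2FsdGVkX18A+++++++++0rIX3dSCYYa216ecXj5pkL9ki5Fa+iJR2jmd2mPUIjP"
altered = original.replace("+++++++++", "AAAAAAAAA")

magic, salt, ciphertext = split_salted_header(original)
magic2, salt2, ciphertext2 = split_salted_header(altered)

print(magic, salt.hex(), salt2.hex())
print(ciphertext == ciphertext2)
```

Both salts start with 0x00 and the ciphertext bytes are untouched, so a decoder that stops at the first 0x00 treats the two strings identically even though bytes 10-16 differ.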
I'm anticipating the salt parameter should be handled the way the IV parameter and the randomly generated IV's are being handled internally.
$encode(message,mcir,key,iv) vs $encode(message,mcs ,key,salt)
The IV parameter shorter than 8 is being padded to length 8 with 0x00's and stored in the header, the same as done with the short salt parameter. By allowing shorter than length 8 IV parameters, this allows additional IV's to be created which otherwise could not be created from a text string. When the 1st byte of a random generated IV is 0x00 and is followed by other non-0x00 bytes, it's not being used as if the IV were entirely 0x00's.
Other programs seeing 0x00's in the 64-bit salt, regardless of whether they were generated randomly or by user input, would generate a completely different salted-key and IV from hashing the passphrase and salt together. If there is a reason that $encode and $decode can't internally handle 64-bit salts containing the 0x00 byte, then $encode can retain compatibility with the test vectors by not creating the 3.1% of random 64-bit values that contain the 0x00 byte, using the method in my 'randsalt' alias, which generates 8 different random numbers from the range 1-255. If 0x00's were no longer allowed in the 64-bit salt, the salt parameter would also need to go back to requiring a length of exactly 8 bytes. However, this would still be incompatible with other programs, which would occasionally generate random 64-bit salts containing 0x00's.
Joined: Dec 2002
Posts: 5,493
Hoopy frood
Ok, thanks for the explanation. To summarize: the salt should always be zero-padded to eight bytes.
When I make this change in the key derivation function, it passes the tests in your scripts.
That said, I am guessing this will break backwards compatibility. But if I understand you correctly, this will make it compatible with the standard implementation.
This change will be in the next beta.
Joined: Jan 2004
Posts: 2,127
Hoopy frood
OP
Quote: "To summarize: the salt should always be zero-padded to eight bytes."
This summary is true for user-defined salts created using the 's' switch, because user-defined text cannot create salt strings with embedded 0x00 bytes followed by other non-0x00 bytes. To also cover the randomly created salts, it would be more accurate to say that the key derivation should always use the entire 8-byte salt found at bytes 9-16 inside the mime string. With the 's' switch, a salt parameter shorter than 8 bytes is padded to 8 with 0x00's, and salt/iv parameters longer than 8 bytes are silently chopped to 8.
Quote: "That said, I am guessing this will break backwards compatibility. But if I understand you correctly, this will make it compatible with the standard implementation."
True. There are 3 incompatibilities introduced by the changes in the recent posts of this thread. For anyone encountering them, these workarounds should solve most cases:
1. For syntax which does not use the 'l' switch for a literal key, the input to the hashing algorithm no longer chops the key parameter at 56 UTF-8 encoded bytes. This alias should allow most older long text passwords to be used after these fixes.
alias chop_key_to_56 {
noop $regsubex(foo,$1,,,&maroon.tmp)
returnex $bvar(&maroon.tmp,1-56).text
}
$decode(mime_string,mc,$chop_key_to_56(key_parameter) )
$decode(mime_string,mr,$chop_key_to_56(key_parameter) )
$decode(mime_string,mi,$chop_key_to_56(key_parameter),user_iv )
The exception would be any key where a character encoded as more than one UTF-8 byte sits partly within and partly beyond the hash algorithm's former 56-byte chop limit.
//echo -a $chop_key_to_56($str(a,55) $+ $chr(233) )
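The same chop can be reproduced outside mIRC to see what happens at the byte level. This is a Python sketch of the assumed old behavior (UTF-8 encode the key, keep the first 56 bytes); it returns bytes instead of text, which is where it departs from the alias above.

```python
def chop_key_to_56(key: str) -> bytes:
    """Assumed old behavior: UTF-8 encode the key, then keep only 56 bytes."""
    return key.encode("utf-8")[:56]

key = "a" * 55 + "\u00e9"        # 55 ASCII bytes + codepoint 233 (2 bytes in UTF-8)
chopped = chop_key_to_56(key)

try:
    chopped.decode("utf-8")
    still_valid_text = True
except UnicodeDecodeError:
    # The chop kept the lead byte 0xC3 and dropped its continuation byte 0xA9.
    still_valid_text = False

print(len(key.encode("utf-8")), len(chopped), hex(chopped[-1]), still_valid_text)
```

The chopped byte string cannot be written back as a text key, since its last byte is only half of a character.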
In the $chr(233) example, only 1 of the character's 2 UTF-8 encoding bytes was used and the other was ignored. Since this used an invalid UTF-8 string as the key, there's no workaround other than allowing key/salt/iv to be hex/binvar.
2. The change where codepoints 256+ in a salt/iv parameter were always used as the '?' character is easy to work around. If this was a user-defined salt created with the 's' switch, the '?' is already in the mime string's header, so the string can be decrypted as normal without the 's' switch. The same applies to a user-defined IV placed into the mime's header using 'mcir' or 'mcirl'. When using 'i' without 'r', the IV is not placed into the header, so the string would need to be decoded by substituting '?' for each codepoint 256+ in the 'i' switch's IV parameter.
3. When the salt was zero-truncated to shorter than 8 bytes before being used, the mime string can be modified to move bytes from the key parameter into the salt, which should allow the new $decode behavior to arrive at the same salted-key+IV used to encrypt the file. There are 2 exceptions which would require hex keys to fix:
a. Depending on where the 0x00 first appears in the salt, up to 8 bytes might need to be moved from the key parameter to the salt. If the key is not long enough, the following alias won't be able to cannibalize the key parameter to create the required 8-byte salt. My example in post 265381 used the 3-byte string 'key' as the key parameter when creating the example strings. Since at most 3 bytes can be moved from that key, the only mime strings which could be fixed for the new decoding behavior would be those where the 0x00 was not in the first 5 bytes of the salt.
b. The key parameter is UTF-8 encoded, but the salt is not, so - depending on the location of multi-byte characters near the end of the key parameter - the key cannibalizing might not be able to create a valid 8-byte salt and a valid UTF-8 text string at the same time.
This alias should fix mime strings encountering issue #3 except for the 2 exceptions noted above:
alias upgrade_salt {
echo -a syntax: //noop $ $+ upgrade_salt(old_mime_string,key_parameter)
var %pattern /^[0-9a-zA-Z/+]+={0,2}/g
if (($len($1) !isnum 32-) || (!$regex($1,%pattern))) { echo -a invalid mime string $1 | return }
bset -t &maroon.tmp 1 $1 | noop $decode(&maroon.tmp,bm)
echo -a $bvar2hex(&maroon.tmp,1-16) $bvar(&maroon.tmp,1-16).text
if ($bvar(&maroon.tmp,1-8).text !=== Salted__) { echo -a this was not created with a random/user salt | return }
if (!$istok($bvar(&maroon.tmp,9-16),0,32)) { echo -a salt does not contain 0x00 and doesn't need to be fixed | return }
var -p %key $2- , %i 9 , %salt_used , %keylen $len(%key)
while (%i isnum 9-16) {
if ($bvar(&maroon.tmp,%i) > 0) var -s %salt_used %salt_used $v1 | else var %i 16 | inc %i
}
while ($numtok(%salt_used,32) < 8) {
var -s %byte $asc($mid(%key,%keylen,1))
noop $regsubex(foo,$mid(%key,%keylen,1),,,&maroon.moved.char)
dec -s %keylen | var -s %salt_used $bvar(&maroon.moved.char,1-) %salt_used
if ($numtok(%salt_used,32) > 8) { echo -a unable to fix due to UTF8 char $chr(%byte) in key | halt }
if (($numtok(%salt_used,32) < 8) && (%keylen == 0)) { echo -a unable to fix due to too-short key | halt }
}
bset &maroon.tmp 9 %salt_used | noop $encode(&maroon.tmp,bm)
noop $regsubex(foo,$left(%key,%keylen),,,&maroon.tmp.key)
echo -a try to decode with: //echo -a $ $+ decode( $bvar(&maroon.tmp,1-).text ,mc, $left(%key,%keylen) )
echo -a if contents of mime are binary you may need to load this mime string into a binvar and decode that
echo -a characters in the key: $bvar(&maroon.tmp.key,1-)
}
If the shortened key contains a trailing space character, you might need to use $+ $chr(32) to re-create the key string shown in the last line of the alias's display.
In the earlier post's example, "//echo -a $encode(message,mcs,key1234,5678)" created a salted key using only the 4-byte salt '5678' instead of also including the four 0x00's stored in the mime header. The 7.55-and-earlier output is:
U2FsdGVkX181Njc4AAAAAOZf+ZmPFy3l
To repair this so the new $decode behavior can decrypt it, the mime string must be altered to move 4 bytes from the key into the salt:
//noop $upgrade_salt(U2FsdGVkX181Njc4AAAAAOZf+ZmPFy3l,key1234)
The alias modifies the salt inside the mime string and recommends trying to decode with a key shortened from 'key1234' to 'key':
//echo -a $decode( U2FsdGVkX18xMjM0NTY3OOZf+ZmPFy3l ,mc, key )
This should work after the fix.
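For anyone repairing strings outside mIRC, the same byte shuffle can be sketched in Python. This is not a port of the alias, only an illustration of its idea under the assumptions that the key is UTF-8 text and that the old behavior hashed the key followed by the salt truncated at its first 0x00; the function name mirrors the alias, but the error handling is my own.

```python
import base64

MAGIC = b"Salted__"

def upgrade_salt(old_mime: str, key: str):
    """Move trailing key bytes into a zero-truncated salt; return (new_mime, new_key)."""
    raw = bytearray(base64.b64decode(old_mime))
    if bytes(raw[:8]) != MAGIC:
        raise ValueError("not created with a random/user salt")
    used = bytes(raw[8:16]).split(b"\x00", 1)[0]   # salt bytes the old version hashed
    if len(used) == 8:
        return old_mime, key                        # no 0x00 in the salt, nothing to fix
    key_bytes = key.encode("utf-8")
    need = 8 - len(used)
    if need > len(key_bytes):
        raise ValueError("unable to fix: key too short")
    remaining, moved = key_bytes[:-need], key_bytes[-need:]
    try:
        new_key = remaining.decode("utf-8")
    except UnicodeDecodeError:
        raise ValueError("unable to fix: split lands inside a UTF-8 character")
    # key + truncated salt == shortened key + (moved bytes + truncated salt)
    raw[8:16] = moved + used
    return base64.b64encode(bytes(raw)).decode("ascii"), new_key

new_mime, new_key = upgrade_salt("U2FsdGVkX181Njc4AAAAAOZf+ZmPFy3l", "key1234")
print(new_mime, new_key)
```

For the example above this yields the same mime string and the same shortened key 'key' that the alias reports.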
Joined: Jan 2004
Posts: 2,127
Hoopy frood
OP
The above 7.55 beta changes are looking good now for CBC. For ECB mode, I noticed that 'er' and 'ei' should probably be invalid switch combos like 'es' is, instead of being silently ignored. Also, to document the ECB behavior for future searchers: key lengths 1-56 remain identical with/without the 'l' switch. While 'el' now invalidates keys longer than 56 bytes, dropping the 'l' switch now reverts to the pre-v7.52 behavior of silently chopping ECB keys at 72 bytes, as shown by the following example with/without the 'l' switch. If the goal is to preserve support for the non-literal 57-72 byte ECB keys, ECB mode without the 'l' switch should at least report an error if the key parameter is longer than 72 bytes, since the non-literal ECB key isn't an unlimited string being filtered through MD5.
//var %i 55 | while (%i isnum 1-73) { echo -a %i $encode(message,em,$left($str(abcde,20),%i)) | inc %i }
Joined: Dec 2002
Posts: 5,493
Hoopy frood
Thanks, these checks have been added to the next version.