This is a follow-up post to correct part of my earlier post, and to document Blowfish issues for anyone searching the forum later.

There are 2-3 bug-ish behaviors in the current handling of the key or salt|IV parameters.

1. It turns out that key input longer than 56 bytes is not quite as 100% invalid as I had thought.

2. A key parameter longer than 56 bytes silently has its extra bytes chopped off.

3. The salt|IV parameter silently turns all codepoints 256+ into the '?' character.

--

1. The issue of whether the max input string is 56 or 72 is caused by conflicting information. A good discussion of this issue is at:


I do agree with the conclusion in that thread, that input could be allowed up to 72 bytes if it's done as an optional switch.

The documents describing the Blowfish design all say it's a 448-bit cipher, and state that not all bits of the final 16-of-72 bytes affect all bits of the encrypted data. A 72-byte key consisting of 56 bytes worth of constants and 16 bytes of actual key material would have lesser strength if the 16 bytes of secret key material were located in bytes 57-72 than if it were anywhere earlier in the key.

The confusion comes from the example source code linked at Schneier's Blowfish page. Not all of it checks for input longer than 56 bytes, or even longer than 72, so someone could use that code with keys longer than 56 bytes. The test vectors linked there give examples only of key lengths of 1-24 bytes, with none consisting entirely of text in the ASCII 33-126 range.

2. Because keys are now chopped after byte 56, people can use $encode key parameters containing data that doesn't affect the encryption, but the lack of effect is disguised by the default random salt changing the ciphertext each time. These examples always generate identical ciphertext because the changing portion of the key parameter lies beyond position 56:

Code:
//echo -a $encode(testtest,cms,$str(a,55) $+ $rand($chr(192),$chr(255)) ,saltsalt )
//echo -a $encode(testtest,cms,$str(a,56) $+ $rand(a,z) ,saltsalt )
//echo -a $encode(testtest,cms,$str(a,72) $+ $str(b,$rand(0,1)) ,saltsalt )
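
To show why those three lines can never produce different ciphertext, here is a rough model of the chopping in Python (not mIRC script). It assumes the key text is UTF-8 encoded and then cut to 56 bytes, which is my inference from the first example rather than anything documented, and effective_key is a made-up name:

```python
MAX_KEY_BYTES = 56  # Blowfish's documented 448-bit key limit

def effective_key(key: str) -> bytes:
    """Assumed model: UTF-8 encode the key text, then keep the first 56 bytes."""
    return key.encode("utf-8")[:MAX_KEY_BYTES]

# 55 'a' plus one codepoint in 192-255: UTF-8 turns that codepoint into
# 0xC3 followed by a varying byte, and the varying byte lands at position 57.
keys_one = {effective_key("a" * 55 + chr(cp)) for cp in range(192, 256)}

# 56 'a' plus a varying letter: the letter is byte 57.
keys_two = {effective_key("a" * 56 + c) for c in "abcxyz"}

# 72 'a' with or without a trailing 'b': everything that differs is past byte 56.
keys_three = {effective_key("a" * 72 + "b" * n) for n in (0, 1)}

print(len(keys_one), len(keys_two), len(keys_three))  # prints: 1 1 1
```

Each set collapses to a single effective key, so with a fixed salt the ciphertext cannot differ.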


A solution could be to add a new switch that is valid only with key lengths of 57-72 bytes, and to reject any key parameter longer than 56 bytes when the switch is not used. This should restore support for pre-7.52 keys which were longer than 56 bytes, while also alerting people who try to use a key which does not conform to the official Blowfish design.
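
The rule I have in mind could be sketched like this, in Python rather than mIRC script. The allow_long flag stands in for the proposed switch, which does not exist, and check_key_length is a made-up name:

```python
def check_key_length(key: bytes, allow_long: bool = False) -> bytes:
    """Accept or reject a key under the proposed rule.

    allow_long is a stand-in for the hypothetical new switch.
    """
    n = len(key)
    if n <= 56:
        if allow_long:
            # The proposal makes the switch valid only for 57-72 byte keys.
            raise ValueError("long-key switch used with a key of 56 bytes or fewer")
        return key
    if n <= 72 and allow_long:
        return key
    if n <= 72:
        raise ValueError("key longer than 56 bytes requires the long-key switch")
    raise ValueError("key longer than 72 bytes is never valid")


print(len(check_key_length(b"a" * 56)))                   # prints: 56
print(len(check_key_length(b"a" * 72, allow_long=True)))  # prints: 72
```

Nothing is ever chopped silently here: a key is either used whole or rejected with an error.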

3. The salt|IV parameter does not UTF-8 encode codepoints 128-255, which is good because that would greatly reduce the number of unique user-defined salt|IV strings that could fit into the 8 bytes. However, if codepoints 256+ are used with 's' or 'i', it replaces them with the 0x3f '?' character, potentially causing duplicate salt|IV strings. I'm not sure what a fix would be, other than rejecting salt|IV strings that would create an IV longer than 8 bytes, allowing the salt|IV parameter to be given as a hex string, or treating a salt|IV parameter containing a codepoint 256+ as an invalid parameter.
Code:
//var %iv ABCDEFGH | bset -t &v 1 $encode(&v,cmir,key,%iv) | noop $decode(&v,bm) | echo -a $bvar(&v,1-16)

output: 82 97 110 100 111 109 73 86 65 66 67 68 69 70 71 72


Decoding only the MIME layer reveals that the encrypted string has a header "RandomIV" followed by the IV parameter as bytes 9-16. The next example places codepoints from both ranges 128-255 and 256+ into the IV:
Code:
//var %iv $chr(233) $+ $chr(234) $+ $chr(10004) $+ $chr(10005) $+ ABCD | bset -t &v 1 $encode(&v,cmir,key,%iv) | noop $decode(&v,bm) | echo -a $bvar(&v,1-16) | clipboard $bvar(&v,1-16)

output: 82 97 110 100 111 109 73 86 233 234 63 63 65 66 67 68


The 128-255 characters are each correctly stored as a single byte, but all codepoints I've tested in the 256+ range are replaced with byte 63, the "?" character.
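
The observed byte pattern happens to match what a Latin-1 conversion with '?' substitution produces. The Python sketch below (not mIRC script) reproduces it and contrasts it with a stricter conversion along the lines of the fixes suggested in point 3. Both function names are made up, and I am not claiming this is how mIRC does it internally:

```python
def observed_iv_bytes(text: str) -> bytes:
    """Mimic the observed behavior: codepoints 0-255 become one byte each,
    anything 256+ becomes b'?' (byte 63)."""
    return text.encode("latin-1", errors="replace")

def strict_iv_bytes(text: str) -> bytes:
    """Hypothetical stricter variant: refuse codepoints 256+ and refuse
    anything longer than 8 bytes instead of altering it silently."""
    raw = text.encode("latin-1")  # raises UnicodeEncodeError for codepoints 256+
    if len(raw) > 8:
        raise ValueError("salt|IV longer than 8 bytes")
    return raw

iv = chr(233) + chr(234) + chr(10004) + chr(10005) + "ABCD"
print(list(observed_iv_bytes(iv)))  # prints: [233, 234, 63, 63, 65, 66, 67, 68]

# Two different inputs collapse into the same 8 bytes:
other = chr(233) + chr(234) + "??ABCD"
print(observed_iv_bytes(iv) == observed_iv_bytes(other))  # prints: True
```

The second print is the duplicate salt|IV problem: a string containing literal '?' characters is indistinguishable from one containing codepoints 256+ in the same positions.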

For the above 2 examples, the same thing happens with the user defined salt parameter. When replacing the 'ir' switches with 's', the "RandomIV" header is replaced with "Salted__", and the same bytes appearing in the 9-16 position are used as the salt instead of as the IV.
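
For anyone wanting to inspect these headers outside mIRC, here is a small Python sketch that splits the MIME-decoded bytes according to the layout seen in the outputs above: an 8-byte "RandomIV" or "Salted__" marker, then the 8-byte IV or salt, then everything else. split_header is a made-up name, and the layout is only what I observed, not a documented format:

```python
def split_header(decoded: bytes):
    """Split MIME-decoded output into (kind, 8-byte value, remaining bytes)."""
    if len(decoded) < 16:
        raise ValueError("need at least 16 bytes of header")
    magic, value = decoded[:8], decoded[8:16]
    kinds = {b"RandomIV": "iv", b"Salted__": "salt"}
    if magic not in kinds:
        raise ValueError("unrecognized header")
    return kinds[magic], value, decoded[16:]

# The 16 bytes printed by the first IV example above.
first16 = bytes([82, 97, 110, 100, 111, 109, 73, 86,
                 65, 66, 67, 68, 69, 70, 71, 72])
kind, value, rest = split_header(first16)
print(kind, value)  # prints: iv b'ABCDEFGH'
```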