Joined: Jan 2004
Posts: 2,127
maroon Offline OP
Hoopy frood
I'm assuming the major goal for $encode is to be compatible with OpenSSL, so 'e' and 'cl' should handle code points 128+ or too-long key parameters the same way OpenSSL does (or should). $encode violates the Blowfish cipher's design requirements in hopefully rare cases:

* when using 'e' with $len(key) or $len($utfencode(key)) greater than 56. There's no issue with shorter keys containing codepoints 128+.

* Using 'el' or 'cl' where the key string contains characters in the 128+ range which are UTF-8 encoded into multiple bytes each. It rejects $len() strings shorter than 56 having the correct $utfencode() length of 56 bytes, but accepts $len() 56 strings incorrectly having more than 56 UTF-8 bytes.

If there's a decision to maintain backwards compatibility for people who have used too-long keys, perhaps add a status window warning advising them to change the weak key.

-

The Blowfish design of 16-round Blowfish uses an expanded password string of 72 bytes, which it fills by allowing the input string to have no more than 56 bytes, then replicates that string as many times as needed until it contains 72 bytes. This requires at least 16 bytes of the 72-byte expanded key be repeats, making it difficult for an attack where someone creates completely different 72-byte expanded passwords which have matching 'sub-keys' within the encryption cipher.
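For readers who want to check the expansion outside mIRC, here is a minimal Python sketch of the rule described above. It assumes the key schedule simply cycles the key bytes across 72 bytes (18 P-array words of 4 bytes each) and that anything past byte 56 is dropped first. The name expand_key and its parameters are hypothetical and are not part of mIRC or OpenSSL.

```python
def expand_key(key: bytes, max_len: int = 56, schedule_len: int = 72) -> bytes:
    """Cycle the first max_len bytes of key until schedule_len bytes are filled.

    Illustration only: a real Blowfish key schedule XORs these bytes into the
    P-array rather than storing them, but the byte pattern is the same.
    """
    if not key:
        raise ValueError("key must contain at least one byte")
    pattern = key[:max_len]                      # enforce the 56-byte limit
    repeats = -(-schedule_len // len(pattern))   # ceiling division
    return (pattern * repeats)[:schedule_len]


if __name__ == "__main__":
    # A key and the same key doubled fill the 72 bytes identically.
    print(expand_key(b"abcd") == expand_key(b"abcdabcd"))
    # With a full 56-byte key, the last 16 bytes repeat the first 16.
    full = bytes(range(56))
    print(expand_key(full)[56:] == full[:16])
```

With the 56-byte limit in place, at least 16 of the 72 bytes are always a repeat of earlier key bytes, which is the property the paragraph above relies on.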

When 'e' 'el' or 'cl' sends literal keys to Blowfish, they're using $utfencode(key) as the replicated byte pattern, as they should. But they're checking $len(key) instead of $len($utfencode(key)) to determine whether the string has the correct number of bytes.
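The difference between the two length checks is easy to see with a key that ends in a single accented character. This Python sketch is only an illustration; the helper names are hypothetical, and the "exactly 56 bytes" rule is the one the 'l' switch is described as enforcing in this post.

```python
def char_and_byte_len(key: str) -> tuple:
    """Return (number of characters, number of UTF-8 bytes) for a key string."""
    return len(key), len(key.encode("utf-8"))


def passes_char_check(key: str, required: int = 56) -> bool:
    """Length check on characters, the behaviour reported as incorrect."""
    return len(key) == required


def passes_byte_check(key: str, required: int = 56) -> bool:
    """Length check on the UTF-8 bytes that are actually handed to the cipher."""
    return len(key.encode("utf-8")) == required


if __name__ == "__main__":
    too_long = "a" * 55 + chr(233)   # 56 characters, 57 UTF-8 bytes
    just_right = "a" * 54 + chr(233)  # 55 characters, 56 UTF-8 bytes
    for key in (too_long, just_right):
        print(char_and_byte_len(key), passes_char_check(key), passes_byte_check(key))
```

The character check accepts the 57-byte key and rejects the 56-byte one, while the byte check does the opposite.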

Assuming mIRC is otherwise compatible with OpenSSL, its current behavior is equivalent to "bset -t &key 1 key-parameter" without the 'a' switch. The only changes needed to fix the length issue are:

* using $bvar(&key,1-56) instead of $bvar(&key,1-72) as the byte pattern expanded to 72 bytes. This should apply only to 'e' or 'cl' since hashing of long non-l 'c' strings doesn't violate Blowfish's design because they use 'hash(key+other)' instead of 'key'.
* the 'l' switch needs to check $bvar(&key,0) instead of $len(key-parameter) when verifying the length is 56.
* decide for compatibility purposes how to handle currently permitted illegal passwords whose $len($utfencode(key)) lengths are greater than 56.

The part below is just demonstrating how I identified that too-long UTF-8 encoded byte strings were being used.

--

Virtually all the test vectors for Blowfish have keys containing bytes which UTF-8 expands into byte pairs, so those test vectors can't be replicated in mIRC without using a binary variable as the key, but that's a separate feature request. Here's an early Blowfish article which shows a test vector using only 7-bit text for both plaintext and key:
Quote:

http://www.drdobbs.com/the-blowfish-encryption-algorithm-one-ye/184409634

************** TEST VECTORS ***********************************
This is a test vector.
Plaintext is "BLOWFISH".
The key is "abcdefghijklmnopqrstuvwxyz".

#define PL 0x424c4f57l
#define PR 0x46495348l
#define CL 0x324ed0fel
#define CR 0xf413a203l
static char keey[]="abcdefghijklmnopqrstuvwxyz";

1) After loading the following code, using '/BFtest A 26' recreates the ciphertext defined in the article, where the plaintext is the 8-byte string BLOWFISH and the key is the 26-byte alphabet. The alias converts the ciphertext display of $bvar(&var,1-) from the 1-3 digit decimals to 2-digit hex, making it easier to match the vector listed in hex.

Code:
alias bvar2hex { var %h $1- | var %i $numtok(%h,32) | while (%i) { var %h $puttok(%h,$base($gettok(%h,%i,32),10,16,2),%i,32) | dec %i } | return %h }
alias BFtest {
  echo -a ---- /BFtest <a|b|c> [Length] | var %switches bm $+ $iif($1 isin c,cri,e) $+ $iif($1 isin bc,l) | bset -t &var 1 BLOWFISH123 | ; the 123 avoids length-8-binary bug
  var %IVchar 7 | var %ChopLength $iif(($2 isnum 1-) && (l !isincs %switches),$int($2),56)
  if ($1 isin c) { var %i 8 | while (%i) { bset &var %i $xor($bvar(&var,%i),%IVchar) | dec %i } }
  if ($1 isin a)  var %key $str(abcdefghijklmnopqrstuvwxyz,3)
  if ($1 isin bc) var %key $str(abc $+ $chr(233) $+ $chr(233),19) | var %key $left(%key,%ChopLength)
  echo switches: %switches keylen: $len(%key) utflen: $len($utfencode(%key)) key: %key plaintext hex/text: $bvar2hex($bvar(&var,1-8)) $bvar(&var,1-8).text
  if ($1 isin c) noop $encode(&var,%switches,%key,$str($chr(%IVchar),8))
  else           noop $encode(&var,%switches,%key)
  noop                $decode(&var,bm) | var %range 1-8 | if ($1 isin c) var %range 17-24
  echo $chr(3) $+ 0,4ciphertext range %range $+ : $bvar2hex($bvar(&var,%range)) / $bvar(&var,1-).text
  if ($1 isin abc) { bset -t &binkey 1 %key | echo pattern replicated to fill 72-byte expanded pass: $bvar(&binkey,1-72) $iif($bvar(&binkey,0) isnum 57-,$chr(22) bytes at the end that should not be used: $bvar(&binkey,57-72)) }
}



2) Using '/BFtest A 57' shows the ciphertext output where the key is the alphabet repeated until it's an invalid key length of 57. It expands to the 72-byte expanded password in a way that can't be created from the correct method of expanding a length 1-56 string to 72 bytes. I was able to duplicate mIRC's encoded output only by altering a Blowfish utility to allow the max input key length be greater than 56.

This issue should only affect switch 'e' and 'cl', because 'c' without 'l' hashes the key string to fewer than 56 bytes, regardless of the length of the key parameter.

3) As I understand OpenSSL's stated behavior, it converts keyboard input to UTF-8 bytes, and it looks like mIRC does that with the 'key' parameter when using the 'e' switch, and 'cl' does the same thing. However, if this string contains any characters which UTF-8 encodes into multiple bytes each:

* 'e' uses as many as 72 UTF-8 encoding bytes in the pattern that's replicating until it's 72 bytes. It should use a pattern no longer than 56 UTF-8 bytes regardless of $len(key).
* 'el' and 'cl' incorrectly accept a key parameter whose 56-character string has a UTF-8 encoded length longer than 56 bytes.
* 'el' and 'cl' incorrectly reject 56-byte UTF-8 encoded strings because the non-encoded length is less than 56.

'/BFtest B' uses the 'el' switches to require length of exactly 56. This key is a 5-character string that's repeated 11 times plus an ending 'a' for a key $len() of 5*11+1=56. However this new key contains characters which UTF-8 encodes to more than 1 byte, so %key has a $utfencode() length of 7*11+1=78. The only way to match $encode's output is when the 72-byte expanded password does not repeat any portion of the key, and instead becomes the first 72 out of those 78 bytes, a string which cannot be obtained by doing as it should, replicating no more than the first 56 UTF-8 encoded bytes.

4) '/BFtest C' has 'cl' replicating the 'el' behavior by using the same password, but XOR's the plaintext by the same ASCII value used to fill the fixed IV. This allows the 'cl' CBC feedback to encrypt only the 1st 8-byte block the same way as done by 'el'. In CBC feedback, the 1st plaintext block is XOR'ed against the IV before it's encrypted, so this XOR causes the IV in 'cl' to XOR the 1st plaintext block back to the same 'BLOWFISH' string seen by 'el'.
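The XOR trick only depends on how CBC treats its first block, so it can be checked without any cipher at all. The Python below is a small sketch of that one step; xor_bytes is a hypothetical helper name.

```python
def xor_bytes(a: bytes, b: bytes) -> bytes:
    """XOR two equal-length byte strings."""
    if len(a) != len(b):
        raise ValueError("inputs must have the same length")
    return bytes(x ^ y for x, y in zip(a, b))


if __name__ == "__main__":
    iv = bytes([7]) * 8                         # fixed IV of eight 0x07 bytes
    stored = xor_bytes(b"BLOWFISH", iv)         # plaintext pre-XORed by the alias
    first_block_input = xor_bytes(stored, iv)   # what CBC hands to the block cipher
    print(first_block_input)

    # An IV of eight spaces (0x20) flips letter case in the same way.
    print(xor_bytes(b"blowfish", b" " * 8))
```

Because XOR with the same value twice cancels out, the block cipher sees the same 8 bytes in CBC mode as it does in ECB mode, so the first ciphertext block can be compared directly.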

Changing %IVchar from 7 to a number in the 128-255 range doesn't alter the encrypted ciphertext, so the 4th parameter isn't UTF-8 encoded for the 'i' switch and probably not for 's' either. I'm pretty sure this is compatible behavior with OpenSSL, as UTF-8 encoding the 's' and 'i' values would greatly reduce the possible values which could fit within 8 bytes, and could cause some to be unexpectedly chopped.

If altering this snippet to test a different key, remember that a Blowfish key whose UTF-8 encoded length is no more than half the allowed 56 bytes is equivalent to the same key repeated, because both expand to the same 72-byte expanded key. For example, 'abcd' and 'abcdabcd' are the same key, as are '/BFtest a 26' and '/BFtest a 52'. Because of the current incorrect 'e' expansion of the key, 'BFtest a 72' is possible and is identical to 26 and 52 instead of being chopped to 56 and making a different expanded key.

Joined: Dec 2002
Posts: 5,408
Hoopy frood
Offline
Thanks for your bug report. Can you summarize the issue in one, short paragraph with a minimal script that reproduces the issue? :-)

1. Summary of issue in one or two lines.
2. Call to only one or two commands/identifiers, if possible.
3. Current output: X.
4. Expected output: Y.

That's all I need. If I cannot understand a short bug report, or cannot reproduce the issue, I will ask for more details.

Joined: Jan 2004
Posts: 2,127
maroon Offline OP
Hoopy frood
2 issues related to invalid key lengths:

#1) When key is longer than 56 bytes (not necessarily same as $len() = 56), Blowfish should either reject as invalid length or chop at 56 bytes so it can repeat a minimum 16 bytes while expanding key to 72 bytes. %badkey returns the same test vector as in the link above, but should have returned the same string as %goodkey because both share the same 1st 56 bytes, and therefore should have expanded to the same 72-byte pattern.

Code:
alias Test_e {
  var %badkey $left($str(abcdefghijklmnopqrstuvwxyz,3),72) | var %goodkey $left(%badkey,56)
  bset -t &data1 1 BLOWFISH | noop $encode(&data1,bme,%badkey ) | noop $decode(&data1,bm) | echo 4 -a Wrong: $bvar(&data1,1-8)
  bset -t &data2 1 BLOWFISH | noop $encode(&data2,bme,%goodkey) | noop $decode(&data2,bm) | echo 3 -a Correct: $bvar(&data2,1-8)
}
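
Why %badkey reproduces the 26-letter test vector can be seen from the byte patterns alone. This Python sketch assumes the expanded key is just the key bytes cycled to 72 bytes; cycle_to_72 is a hypothetical helper, not an mIRC or OpenSSL function.

```python
ALPHABET = b"abcdefghijklmnopqrstuvwxyz"


def cycle_to_72(pattern: bytes) -> bytes:
    """Repeat pattern until 72 bytes are filled (pattern must be non-empty)."""
    return (pattern * (72 // len(pattern) + 1))[:72]


if __name__ == "__main__":
    bad_key = (ALPHABET * 3)[:72]    # same bytes as %badkey
    good_key = bad_key[:56]          # same bytes as %goodkey

    unchopped = cycle_to_72(bad_key)   # all 72 key bytes used as-is
    chopped = cycle_to_72(good_key)    # first 56 bytes, then repeated

    print(unchopped == cycle_to_72(ALPHABET))  # same as the 26-letter key
    print(chopped == unchopped)
    print(unchopped[52:60], chopped[52:60])
```

The unchopped 72-byte key is a plain repetition of the alphabet, so it is the same expanded key as the 26-byte test vector key. The chopped key restarts the alphabet after byte 56, which gives a different expanded key.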



#2) el and cl correctly give the UTF-8 encoded $utfencode bytes to Blowfish, but incorrectly validate the un-encoded $len() length, accepting 57-72 byte keys because $len() is 56, but rejecting 56-byte keys which have a shorter $len().

I got 'el' and 'cl' to both return the same vector by having cl's XOR of lower-case plaintext and IV of eight 0x20 spaces cancel each other out.

'cl' should have either rejected %badkey as having 57 bytes or used key containing the 56 bytes of $bvar(&bad57,1-56) and should have accepted $len() 55 %goodkey as a valid key containing 56 bytes.

Correct ciphertext for key being first 56 bytes of $bvar(&bad57,1-56) is:

hex: 43 37 A2 45 17 96 A3 01
decimal: 67 55 162 69 23 150 163 1

Code:
alias Test_cl {
  var %badkey  $str(a,55) $+ $chr(233)
  var %goodkey $str(a,54) $+ $chr(233)
  bset -t &data1 1 BLOWFISH | noop $encode(&data1,bmel ,%badkey                 ) | noop $decode(&data1,bm) | echo 4 -a 57-byte key: $bvar(&data1,1-8)
  bset -t &data2 1 blowfish | noop $encode(&data2,bmcli,%badkey,$str($chr(32),8)) | noop $decode(&data2,bm) | echo 4 -a 57-byte key: $bvar(&data2,1-8)

  bset -t &bad57  1 %badkey  | echo -a Above Accepts $bvar(&bad57 ,0) bytes: $bvar(&bad57 ,1-)
  bset -t &good56 1 %goodkey | echo -a Below Rejects $bvar(&good56,0) bytes: $bvar(&good56,1-)

  echo -a Next 2 lines should return same vector, key has 56 UTF-8 bytes but 'cli' rejects as invalid parameter:
  bset -t &data1 1 BLOWFISH | noop $encode(&data1,bme  ,%goodkey                 ) | noop $decode(&data1,bm) | echo 3 -a 56-byte key: $bvar(&data1,1-8)
  bset -t &data2 1 blowfish | noop $encode(&data2,bmcli,%goodkey,$str($chr(32),8)) | noop $decode(&data2,bm) | echo 3 -a 56-byte key: $bvar(&data2,1-8)
}


I don't know how 'c' without 'l' hashes the key, but I expect that it correctly inputs the UTF-8 bytes to the hash, and returns the correct hash output. Since the hash output is shorter than 56, there will not be an issue of invalid key length there.

Joined: Dec 2002
Posts: 5,408
Hoopy frood
Offline
Thanks. I was able to reproduce both issues. I have changed the behaviour so that the key and the salt/iv are chopped at 56 and 8 UTF-8 characters respectively. The encode/decode routines treat both of these as circular buffers either way. These changes will be in the next version.

Joined: Jan 2004
Posts: 2,127
maroon Offline OP
Hoopy frood
Quote:
so that the key and the salt/iv are chopped at 56 and 8 UTF-8 characters respectively


Thanks, but a clarification: I did not mention the UTF-8 issue related to Salt/IV in my latest post, because I did not think $encode handles Salt/IV wrong. The only reason one of my examples used an IV was that an IV was the only way to keep 'cl' from using a random salt, allowing me to demonstrate the 'cl' key's UTF-8 behavior in an unchanging vector.

Unless you're finding $encode is not handling ASCII 128-255 within IV in a compatible way with OpenSSL, I don't think the Salt/IV need changing. Current behavior is to not UTF-8 encode the IV/Salt into longer byte strings when ASCII 128-255 are used, so I was not finding a length issue there.

The red/green lines show that Salt and IV are not storing ASCII 128-255 into the ciphertext header as UTF-8 byte pairs.

The blue/maroon lines show $encode doesn't use a UTF-8 encoded IV internally either, or else blue/maroon would not have matching ciphertexts. Matching ciphertexts could not happen if maroon's IV is full of UTF-8 byte pairs while the binary plaintext contains 8 identical bytes. I assume the switch "s" Salt is handled the same way as IV internally, but I couldn't verify without knowing how $encode hashes the key and Salt together.

(Bump for my feature request to permit defining key/salt/iv as binary variables using capital switches KSI.)

Code:
alias test_ivsalt {
  var %data8 abc $+ $chr(233) $+ $chr(233) $+ def | echo -a iv/salt for red/green is %data8
  bset -t &data1 1 BLOWFISH1234567 | noop $encode(&data1,bmcri,key,%data8) | noop $decode(&data1,bm) | echo 3 -a As Text: $bvar(&data1,1-).text | echo 3 -a Len $bvar(&data1,0) Bytes: $bvar(&data1,1-)
  bset -t &data2 1 BLOWFISH1234567 | noop $encode(&data2,bmcs ,key,%data8) | noop $decode(&data2,bm) | echo 4 -a As Text: $bvar(&data2,1-).text | echo 4 -a Len $bvar(&data2,0) Bytes: $bvar(&data2,1-)

  var %asc 116
  var %xor1 000 | var %iv1 $str($chr($xor(%asc,%xor1)),8)
  var %xor2 157 | var %iv2 $str($chr($xor(%asc,%xor2)),8)
  bset -c &data1 1 $str($xor(%asc,%xor1) $chr(32),8) $str(a $chr(32),7) | echo 2 -a XOR by %xor1 Data: $bvar(&data1,1-) IV: %iv1 | noop $encode(&data1,bmcri,key,%iv1) | noop $decode(&data1,bm) | echo 2 -a As Text: $bvar(&data1,1-).text | echo 2 -a Len $bvar(&data1,0) Bytes: $bvar(&data1,1-)
  bset -c &data2 1 $str($xor(%asc,%xor2) $chr(32),8) $str(a $chr(32),7) | echo 5 -a XOR by %xor2 Data: $bvar(&data2,1-) IV: %iv2 | noop $encode(&data2,bmcri,key,%iv2) | noop $decode(&data2,bm) | echo 5 -a As Text: $bvar(&data2,1-).text | echo 5 -a Len $bvar(&data2,0) Bytes: $bvar(&data2,1-)
}


Joined: Dec 2002
Posts: 5,408
Hoopy frood
Offline
Thanks, yes, the only change is that both are chopped at the appropriate length, although looking at the code this is not necessary for the salt/iv as the routines that use it only use the first 8 bytes anyway.

Joined: Jan 2004
Posts: 2,127
maroon Offline OP
Hoopy frood
Thanks. Looks like both Blowfish bugs are fixed. I can't get it to use 57-or-more bytes of the 'key' parameter - whether or not it contains UTF-8 byte-pairs, and short lengths of binary variables no longer return "line too long" error.

There is a change caused during this fix. The 'l' switch now causes a literal key regardless of length, no longer enforcing length 56. This means 'l' has no effect when used with the 'e' switch, and 'cl' accepts a literal key of any length of 1-56 with CBC feedback, and doesn't hash the key. This results in output compatible with 'e' for the first 8-byte block only, as long as 'cli' uses an IV which is the XOR of its own data and the data used by 'e'.

It's up to you whether you consider this a bug. I think it's fine to leave it as-is, since the new 'l' behavior for non-56 length is returning values where it had formerly returned an error. It would just need /help to remove the "must be 56 characters" portion of the 'l' description.

This is the earlier post's link's test vector, and new beta returns the same output as before:

Code:
//bset -t &data 1 BLOWFISH | noop $encode(&data,bme, abcdefghijklmnopqrstuvwxyz ) | noop $decode(&data,bm) | echo 3 -a $bvar(&data,1-8)


If you change the above switches from 'bme' to 'bmel', you now get the same output. Previously, adding the 'l' switch generated an error because the key length wasn't 56.

'cl' now allows literal keys shorter than 56, and I can tell the 'cl' switches use literal instead of hashed keys for non-56 lengths by the fact that below returns the same test vector output as above for just the 1st 8 bytes, because the IV uses the space character which is the XOR of UPPER/lower case data:

Code:
//bset -t &data 1 blowfish | noop $encode(&data,bmcli, abcdefghijklmnopqrstuvwxyz , $str($chr(32),8) ) | noop $decode(&data,bm) | echo 4 -a $bvar(&data,1-8)


Joined: Dec 2002
Posts: 5,408
Hoopy frood
Offline
Thanks, I have removed the "must be 56 characters" portion of the 'l' description for the next version.

Joined: Jan 2004
Posts: 2,127
maroon Offline OP
Hoopy frood
I see the salt/iv parameter is no longer forced to be exactly length 8, truncating to 8 if longer, and if the parameter is shorter than 8, it encrypts using a salt/iv padded with $chr(0)'s to length 8.
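As a way of stating that observation precisely, the rule can be written in a couple of lines of Python. This is only a sketch of the behavior described in this post, not of mIRC's source; normalize_salt_iv is a hypothetical name.

```python
def normalize_salt_iv(raw: bytes, size: int = 8) -> bytes:
    """Truncate to size bytes, or right-pad with 0x00 bytes up to size."""
    return raw[:size].ljust(size, b"\x00")


if __name__ == "__main__":
    print(normalize_salt_iv(b"abc"))          # shorter than 8: zero padded
    print(normalize_salt_iv(b"0123456789"))   # longer than 8: truncated
```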

However, when the salt/iv parameter is used but the contents are $null, $encode now returns $null and the binary variable as the 1st parameter is unchanged. Not sure if this is intended. Perhaps it would be better if it either returns an error for invalid parameter or encrypts using salt/iv consisting of 8 $chr(0)'s.

Code:
//var %salt $null | bset -t &data 1 TEST | echo -a result: $encode(&data,bmcs,key,%salt) | echo -a contents: $bvar(&data,1-).text


Joined: Dec 2002
Posts: 5,408
Hoopy frood
Offline
This is actually typical of most identifiers. If a parameter is $null and the identifier does not accept $null for that parameter, it returns $null as the result and does not process the identifier. In this case, $null is not allowed for the salt parameter, so the identifier returns $null without doing anything.

In this case, I can change $encode()/$decode() to allow $null for all parameters. This will be in the next beta.

Update: Actually, making that change could be a problem as it could affect backwards compatibility. I will make the above change so that it only applies to the 'ec' encryption switches.

Joined: Jan 2004
Posts: 2,127
maroon Offline OP
Hoopy frood
This is a follow-up post to correct part of my earlier post, and to document Blowfish issues for anyone searching the forum later.

There are 2-3 bug-ish behaviors in the current handling of the key or salt|IV parameters.

1. It turns out that key input longer than 56 bytes is not quite as 100% invalid as I had thought.

2. key parameter longer than 56 bytes is silently chopping extra bytes

3. salt|IV parameter is silently turning all codepoints 256+ into the '?' character.

--

1. The issue of whether the max input string is 56 or 72 is caused by conflicting information. A good discussion of this issue is at:


I do agree with the conclusion in that thread, that input could be allowed up to 72 bytes if it's done as an optional switch.

The documents describing the Blowfish design all say it's a 448-bit cipher, and state that not all bits of the final 16-of-72 bytes affect all bits of the encrypted data. A 72-byte key consisting of 56 bytes worth of constants and 16 bytes of actual key material would have lesser strength if the 16 bytes of secret key material were located in bytes 57-72 than if it were anywhere earlier in the key.

The confusion comes from some of the example source codes linked at Schneier's Blowfish page which don't all check for input longer than 56, or even whether it's longer than 72, meaning that someone could use that code to make use of keys longer than 56 bytes. The test vectors linked there give examples only of key lengths of 1-24 bytes, with none consisting entirely of text in the ASCII 33-126 range.

2. Keys now being chopped after byte 56 means that people can use $encode key parameters containing data that doesn't affect the encryption, but the lack of effect is disguised by the default random salt changing the ciphertext each time. These examples always generate identical ciphertext because the changing portion of the key parameter exists beyond position 56:

Code:
//echo -a $encode(testtest,cms,$str(a,55) $+ $rand($chr(192),$chr(255)) ,saltsalt )
//echo -a $encode(testtest,cms,$str(a,56) $+ $rand(a,z) ,saltsalt )
//echo -a $encode(testtest,cms,$str(a,72) $+ $str(b,$rand(0,1)) ,saltsalt )
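
The first of those examples is worth a closer look, because the random character is at character position 56 yet still has no effect. A short Python sketch shows why, assuming the key is UTF-8 encoded and then chopped at 56 bytes as described; effective_key is a hypothetical name.

```python
def effective_key(key: str, max_len: int = 56) -> bytes:
    """UTF-8 encode the key, then keep only the first max_len bytes."""
    return key.encode("utf-8")[:max_len]


if __name__ == "__main__":
    # 55 'a' characters followed by any codepoint in 192-255.
    split_variants = {effective_key("a" * 55 + chr(cp)) for cp in range(192, 256)}
    print(split_variants)

    # 56 'a' characters followed by any lowercase letter.
    tail_variants = {effective_key("a" * 56 + c) for c in "abcdefghijklmnopqrstuvwxyz"}
    print(tail_variants)
```

Codepoints 192-255 all encode to two bytes starting with 0xC3, so the chop at 56 keeps the shared 0xC3 lead byte and discards the second byte that distinguishes them. All 64 keys collapse into one, with a multi-byte character split across the 56-byte border.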


A solution could be to allow a new switch to be valid only with byte lengths 57-72, and reject a key parameter longer than 56 bytes without using the switch. This should restore support for pre7.52 keys which were longer than 56 bytes, while also alerting people trying to use a key which does not conform to the official Blowfish design.

3. The salt|IV parameter does not UTF-8 encode codepoints 128-255, which is good because that would greatly reduce the number of unique user-defined salt|IV strings that could fit into the 8-byte string. However if codepoints 256+ are used with 's' or 'i', it substitutes them with the 0x3f '?' character, potentially causing duplicate salt|IV strings. I'm not sure what a fix would be, other than rejecting salt|IV strings that would create an IV longer than 8 bytes, allowing the salt|IV parameter to be listed as a hex string or treating salt|IV parameter containing codepoint 256+ as an invalid parameter.
Code:
//var %iv ABCDEFGH | bset -t &v 1 $encode(&v,cmir,key,%iv) | noop $decode(&v,bm) | echo -a $bvar(&v,1-16)

output: 82 97 110 100 111 109 73 86 65 66 67 68 69 70 71 72


By decoding only the mime layer, this reveals that the encrypted string has a header "RandomIV" followed by the IV parameter as bytes 9-16. The next example places codepoints from both ranges 128-255 and 256+ into the IV:
Code:
//var %iv $chr(233) $+ $chr(234) $+ $chr(10004) $+ $chr(10005) $+ ABCD | bset -t &v 1 $encode(&v,cmir,key,%iv) | noop $decode(&v,bm) | echo -a $bvar(&v,1-16) | clipboard $bvar(&v,1-16)

output: 82 97 110 100 111 109 73 86 233 234 63 63 65 66 67 68


The 128-255 characters are correctly placed as a single byte, but all codepoints I've tested in the 256+ range are replaced with the 63 "?" character.

For the above 2 examples, the same thing happens with the user defined salt parameter. When replacing the 'ir' switches with 's', the "RandomIV" header is replaced with "Salted__", and the same bytes appearing in the 9-16 position are used as the salt instead of as the IV.
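The byte values in both outputs happen to match what a single-byte (Latin-1 style) conversion with '?' substitution would produce. The Python below reproduces that observed pattern only; it is an assumption about the effect, not a claim about how mIRC converts the parameter internally, and observed_iv_bytes is a hypothetical name.

```python
def observed_iv_bytes(iv: str, size: int = 8) -> bytes:
    """Map codepoints 0-255 to one byte each, anything higher to '?' (0x3F)."""
    return iv.encode("latin-1", errors="replace")[:size]


if __name__ == "__main__":
    iv = chr(233) + chr(234) + chr(10004) + chr(10005) + "ABCD"
    header = b"RandomIV" + observed_iv_bytes(iv)
    print(" ".join(str(b) for b in header))
```

Any two IV or salt strings that differ only in their 256+ codepoints end up as the same 8 bytes under this mapping, which is the duplicate-salt concern raised above.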

Joined: Jan 2004
Posts: 2,127
maroon Offline OP
Hoopy frood
Bug: CBC with non 'l' switch chops the key parameter's input to the underlying $md5 hash function.

The limit of 56-bytes is being applied as an output filter to the key parameter, instead of being applied only as the input filter to the key schedule subroutine which accepts a string of 1-56 bytes and expands it to length 72.

This affects the CBC switch combos which do not use the 'l' switch: 'cr' and 'ci' which use hash(key parameter) as input to calculating (secret56), and 'c' and 'cs' use hash(key_parameter:salt) to generate 64 digits for (secret56:IV8). The purpose of the $md5 hash is to ensure the input can't be longer than the 56 limit, so there's no reason to limit that type of input. This alias shows 4 examples of 94-byte key parameters which are able to be decrypted with non-literal 56-byte keys, where this is caused by the input to md5(string) limiting the 'key' portion of the input at 56 bytes:

Code:
alias bf_hash_input_limit56 {
  var -s %key94 $regsubex(junk,$str(x,94),/x/g,$chr($calc(32+ \n))) , %key56 $left(%key94,56)
  var -s %a $encode('mc' hash key94 + random salt,mc,%key94)
  echo -a $decode(%a,mc,%key56)
  var -s %a $encode('mcs' hash key94 + fixed salt,mcs,%key94,SaltOrIV)
  echo -a $decode(%a,mc,%key56)
  var -s %a $encode('mcr' hash key94 only + random iv  ,mcr,%key94)
  echo -a $decode(%a,mcr,%key56)
  var -s %a $encode('mci' hash key94 only + fixediv  ,mci,%key94,SaltOrIV)
  echo -a $decode(%a,mci,%key56,SaltOrIV)
}

http://www.herongyang.com/Blowfish/Perl-Crypt-CBC-Salted-Key-Test-Cases.html

The 2nd of 4 examples on this page is a test vector where the input key is a binary string of length 80 bytes. It does not display the 56-byte secret key generated by the hash, but it does display the IV as 3c05d2f32c8d1d14, which matches the output from the above algorithm modified to accept binary keys, and which does not limit the length of the string being hashed by $md5. In this alias, $md5 is hashing a binary string of length 16+56+8=80 bytes, and the test vector's IV is the last 8 of 64 bytes generated by the hashing subroutine.

Code:
alias Openssl_salted_keygen_binary {
  bunset &raw &key
  var -s %binkey 1122334455667788990011223344556677889900112233445566778899001122334455667788990011223344556677889900112233445566778899001122334455667788990011223344556677889900
  var -s %binsalt 0000000000000000
  bset -c &pass+salt 1 $regsubex(junk,%binkey $+ %binsalt,/(..)/g,$base(\t,16,10) $chr(32))
  while ($bvar(&key,0) < $calc(56+8)) { noop $salted_digest_to_binary }
  bcopy -c &pass 1 &key 1 56 | bcopy -c &iv  1 &key 57 8
  var %bin_key  $regsubex($bvar(&key,1-56) ,/(\d+)/g,$base(\t,10,16,2) $chr(32))
  var %bin_salt $regsubex($bvar(&key,57- ) ,/(\d+)/g,$base(\t,10,16,2) $chr(32))
  echo 4 -a literal key in hex: %bin_key
  echo 4 -a literal  iv in hex: %bin_salt
}
alias salted_digest_to_binary {
  if ($bvar(&hash,0)) bcopy -c &raw 1 &hash 1 -1 | bcopy &raw -1 &pass+salt 1 -1
  bset -c &hash 1 $regsubex($md5(&raw,1),/(..)/g,$base(\1,16,10) $chr(32)) | bcopy &key -1 &hash 1 -1
  if ($bvar(&key,0) > $calc(56+8)) bcopy -c &key 64 &key 64 1
  echo 3 -a $bvar(&key,0) of 64 generated: $regsubex($bvar(&key,1-) ,/(\d+)/g,$base(\t,10,16,2) $chr(32))
}

Result from running: /Openssl_salted_keygen_binary

literal key in hex: C3 63 D2 5C 49 8B 5B E0 D5 5C 23 38 06 D8 89 BC 73 4C 49 FE E8 71 BB E1 73 24 0C 38 EA CC B8 5A 73 9D BA 62 8D 2D 64 15 FE 34 61 EC 17 69 02 E8 71 15 CE 92 A7 81 EA 34
literal iv in hex: 3C 05 D2 F3 2C 8D 1D 14

... which matches the IV shown in the linked test vector.
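For comparison outside mIRC, the same chained-MD5 derivation can be written in a few lines of Python. This is a sketch of the scheme as the alias above implements it (each digest is MD5 of the previous digest plus password plus salt, concatenated until 64 bytes exist), with no limit on the password length. The function name and parameters are hypothetical.

```python
import hashlib


def md5_key_and_iv(password: bytes, salt: bytes, key_len: int = 56, iv_len: int = 8):
    """Derive (key, iv) by chaining MD5 over previous_digest + password + salt."""
    material = b""
    digest = b""
    while len(material) < key_len + iv_len:
        digest = hashlib.md5(digest + password + salt).digest()
        material += digest
    return material[:key_len], material[key_len:key_len + iv_len]


if __name__ == "__main__":
    salt = bytes(8)  # eight 0x00 bytes, as in %binsalt
    long_key, _ = md5_key_and_iv(b"k" * 94, salt)
    short_key, _ = md5_key_and_iv(b"k" * 56, salt)
    # If the hash input is not chopped, a 94-byte and a 56-byte password differ.
    print(long_key != short_key)
```

Feeding it the bytes of %binkey and %binsalt should give the same key and IV that the alias prints, provided the alias and this sketch really do follow the same scheme.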

--

The other routine used by 'cr' and 'ci' is described at:

http://www.drdobbs.com/web-development/encryption-using-cryptcbc/184416083

It limits the non-literal key's strength to 128 bits because it expands the $md5 hash digest from 16 to 56 bytes in a way where identical first-16 bytes must always have identical bytes 17-56. After describing the method of md5-hashing the non-literal key, it says:

"On Line 7, it is perfectly all right to use a key of arbitrary length because regenerate_key is set to 1."

I read 'arbitrary' as meaning the input to the hash is allowed to be any length, including those longer than the 56 limit of the secret key. I haven't located any test vectors to prove that 'arbitrary' means md5(string longer than 56).

Joined: Jan 2004
Posts: 2,127
maroon Offline OP
Hoopy frood
In addition to limiting the key-parameter's portion of the input to the md5 hash function at 56 bytes, $encode is also not treating the Blowfish key parameter as required, permitting it to be $null, resulting in numerous switch configurations where there's little or no secret material used to generate the key.

Code:
//var %a $null | echo -a $encode(test,mcrl,%a)
//var %a $null | echo -a $encode(test,me,%a)
both keys are same as if the key 0x00 hex is used
//var %a $null | echo -a $encode(test,mcr,%a)
key derived from binary digest of $md5($null)
//var %a $null | echo -a $encode(test,mcs,%a,SaltSalt)
key derived from binary digest of $md5($null $+ SaltSalt)

Joined: Dec 2002
Posts: 5,408
Hoopy frood
Offline
Quote:
In addition to limiting the key-parameter's portion of the input to the md5 hash function at 56 bytes, $encode is also not treating the Blowfish key parameter as required, permitting it to be $null, resulting in numerous switch configurations where there's little or no secret material used to generate the key.

Isn't this the change that I mentioned in my previous post? That it now allows all parameters to be $null. It is up to the scripter to provide the correct parameters.

Joined: Dec 2002
Posts: 5,408
Hoopy frood
Offline
I have to say, I feel like I am going around in circles with this. When I first implemented these identifiers, they were intended to be OpenSSL-compatible by default. However, the scripters testing them made a number of requests regarding how parameters should be handled/truncated/converted to UTF-8/etc. and they ended up not being OpenSSL-compatible by default. The code contains numerous commented out checks for lengths/conversions/etc that were originally used but were changed on request.

Quote:
1. It turns out that key input longer than 56 bytes is not quite the 100% invalid as I had thought.

2. key parameter longer than 56 bytes is silently chopping extra bytes

A solution could be to allow a new switch to be valid only with byte lengths 57-72, and reject a key parameter longer than 56 bytes without using the switch. This should restore support for pre7.52 keys which were longer than 56 bytes, while also alerting people trying to use a key which does not conform to the official Blowfish design.

Okay, I will revert this change so that literal keys are limited to 56 bytes again but non-literal keys are not. This will halt/break scripts that use longer literal keys and break scripts that use longer non-literal keys.

Quote:
3. salt|IV parameter is silently turning all codepoints 256+ into the '?' character.

I'm not sure what a fix would be, other than rejecting salt|IV strings that would create an IV longer than 8 bytes, allowing the salt|IV parameter to be listed as a hex string or treating salt|IV parameter containing codepoint 256+ as an invalid parameter.

There is commented out code that rejects salt/IVs not 8 bytes long - someone requested that any length be allowed. I will change it so that it will halt with an error if a scripter tries to use codepoints 256+ in the salt/IV. This will break all scripts that use codepoints 256+ in the salt/IV.

Joined: Jan 2004
Posts: 2,127
maroon Offline OP
Hoopy frood
Quote:
Actually, instead of halting the script with an error, it could just UTF-8 encode codepoints 256+. This would still break older scripts but it would allow newer scripts to use Unicode characters if they wished.


As long as the IV/Salts can only be input as a text string, the current behavior of not UTF8 encoding codepoints 128-255 is good, because the alternative is to greatly reduce the number of possible salts. Allowing salts shorter than 8 also increases the number of possible salts, since that's the only way to get 0x00 into the salt. If using $chr(10004) were another way of using $chr(226) $+ $chr(156) $+ $chr(148) that would be better than silently changing all 256+'s into codepoint 63's, as long as someone's use of $chr(10004) didn't cause it to silently ignore 2 other characters of the salt.

As for breaking older scripts, they could still decode their messages by changing those characters into questionmark 63's.

Unless it needs to maintain compatibility with something else that does so, I don't think it's a good idea to silently ignore portions of the key/salt/IV parameters because they're too long. If it's longer than a valid encryption parameter, then it can't be a valid parameter. The switch combos not using the 'l' literal switch use the key parameter as input to an MD5 hash, and it seems reasonable that a string of any length should be valid as input to a hash digest. So far, the only example I can find describing the key being hashed the same way $encode hashes non-literal keys is from the Crypt::CBC package, and I showed a link where it accepted MD5 input longer than 56 bytes, then used the output of the recursive hashing to make a 56-byte key out of it.
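
For reference, the recursive MD5 scheme can be sketched in Python like this. It follows the chained-digest derivation as I understand it from Crypt::CBC's salted mode (and OpenSSL's EVP_BytesToKey with MD5 and one iteration). That $encode matches it byte for byte is an assumption, and the function name and default lengths are only for illustration.

```python
import hashlib


def salted_key_and_iv(passphrase: bytes, salt: bytes,
                      key_len: int = 56, iv_len: int = 8):
    """Chain MD5 digests until there is enough material for key + IV.

    Each round hashes (previous digest + passphrase + salt); the
    passphrase itself can be any length because it only feeds the hash.
    """
    material = b""
    block = b""
    while len(material) < key_len + iv_len:
        block = hashlib.md5(block + passphrase + salt).digest()
        material += block
    return material[:key_len], material[key_len:key_len + iv_len]


# An 80-byte passphrase is fine here: only the derived key must be 56 bytes.
key, iv = salted_key_and_iv(b"x" * 80, b"12345678")
```

The point is that the 56-byte limit applies to what comes out of the hashing, not to the passphrase that goes in.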

I guess a 'chop me' switch could be confusing if it makes the user think their 16-byte salt or 80-byte literal key would not be chopped in the absence of that switch. I was thinking of it being more for the edge case of forcing too-long keys to be used in a valid manner, including the current handling of a UTF-8 encoded character that's split across both sides of the 56-byte border.

Joined: Jan 2004
Posts: 2,127
maroon Offline OP
Hoopy frood
The CBC 'salt' should be a 64-bit binary value regardless of contents, but instead it is sometimes truncated to a shorter string, because $encode removes any 0x00 byte and anything following it. I also found this in v7.55 non-beta and v7.36, so it's not related to the recent fixes.

This results in 1 of every 256 random 8-byte salt strings being truncated to $null, because the first byte happens to be 0x00. There are additional identical truncated salts where the first occurrence of 0x00 is at a different byte position, increasing the frequency of these dupes. The test vector I linked above produces the correct outcome only when that salt of all zeroes is handled as eight 0x00 bytes instead of appending $null to the password.

Bytes 9-16 of the decoded mime string contain the 8-byte salt, so this portion of the string is almost always unique when it contains a random salt. However, the encrypted data is often identical because the salted key and IV are created using matching truncated salts. This next example almost always finds somewhere between 100 and 130 messages using a truncated salt that was previously used in this same group of messages.

Note: this alias takes approx 3 minutes, and the displayed total does not include salts which were truncated without being duplicated. If they were included, the total would have been closer to 800.

Code:
//var %i 25600 , %count 0 | hfree -w salt | while (%i) { var %a $mid($encode(abcdefghijklmnopqrstuvwxyz,mc,key),23) | if ($hfind(salt,%a)) { inc %count | echo -a i: %i match# %count : $v1 } | else hadd -m salt %a | dec %i }
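
The same experiment can be modeled outside mIRC with a rough Python simulation. It assumes the salts are uniformly random 8-byte strings and that the truncation works as described above (everything from the first 0x00 onward is dropped). The function names and the fixed seed are only for illustration, and exact counts will differ from a real $encode run.

```python
import random


def truncate_at_nul(salt: bytes) -> bytes:
    """Model of the reported behavior: drop the first 0x00 and everything after it."""
    return salt.split(b"\x00", 1)[0]


def simulate(messages: int = 25600, seed: int = 1):
    """Return (salts that got truncated, salts repeating an earlier effective salt)."""
    rng = random.Random(seed)
    seen = set()
    truncated = 0
    repeats = 0
    for _ in range(messages):
        salt = bytes(rng.randrange(256) for _ in range(8))
        effective = truncate_at_nul(salt)
        if len(effective) < 8:
            truncated += 1
        if effective in seen:
            repeats += 1
        else:
            seen.add(effective)
    return truncated, repeats


# Chance that a uniformly random 8-byte salt contains at least one 0x00 byte.
p_nul = 1 - (255 / 256) ** 8

truncated, repeats = simulate()
print(f"truncated: {truncated}, repeats: {repeats}, about 1 in {1 / p_nul:.2f}")
```

Under these assumptions the truncated count should land near 25600 * p_nul, which is roughly the "closer to 800" figure, and most of the repeats come from salts whose first or second byte is 0x00.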


Joined: Jan 2004
Posts: 2,127
maroon Offline OP
Hoopy frood
Update:

The true ratio is about 1 out of every 32.44 messages: that is how often a random 8-byte salt contains at least one 0x00 byte, and so ends up as an incompatible salt shorter than 8 bytes.

Code:
//echo -a $calc(1/ (1-(255/256)^8) )


The workaround until this is fixed would be to limit the salt to one of the 97% of salt strings which don't contain the 0x00 byte:

Code:
alias randsalt returnex $regsubex($str(x,8),/x/g,$chr($rands(1,255)))


then use it with the 's' switch when creating salted messages:

Code:
$encode(message,mcs,key,$randsalt)


After the fix, I estimate that this should decrypt as normal text:

Code:
//echo -a $decode(U2FsdGVkX18AAAEAAOkAAIC4MK5GejRGGZ/W3g4BJoY=,mc,test key)

and
Code:
//echo -a $encode(message,mcs,key1234,5678)

result now: U2FsdGVkX181Njc4AAAAAOZf+ZmPFy3l same as $encode(message,mcs,key,12345678)
after fix: U2FsdGVkX181Njc4AAAAAKR3f8+Rmn/r
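
To see which salt bytes these strings actually carry, the mime text can be split with a few lines of Python. This assumes the usual OpenSSL-style layout of an 8-byte "Salted__" marker, then 8 salt bytes, then the ciphertext; the helper name is made up for illustration.

```python
import base64


def split_salted(b64_text: str):
    """Split an OpenSSL-style salted message into (salt, ciphertext)."""
    raw = base64.b64decode(b64_text)
    if raw[:8] != b"Salted__":
        raise ValueError("missing Salted__ header")
    return raw[8:16], raw[16:]


# The test string for $decode carries a salt that is mostly 0x00 bytes.
salt_a, data_a = split_salted("U2FsdGVkX18AAAEAAOkAAIC4MK5GejRGGZ/W3g4BJoY=")

# Both the 'result now' and 'after fix' strings store 5678 padded with four 0x00 bytes.
salt_b, data_b = split_salted("U2FsdGVkX181Njc4AAAAAOZf+ZmPFy3l")
salt_c, data_c = split_salted("U2FsdGVkX181Njc4AAAAAKR3f8+Rmn/r")

print(salt_a.hex(), salt_b.hex(), salt_c.hex())
```

So the stored salt field is identical in the last two strings, and only the encrypted portion differs.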


Joined: Dec 2002
Posts: 5,408
Hoopy frood
The salt is a string, a %var not a &binvar. It cannot contain null characters. If you would like the salt to be interpreted as a &binvar, we would need to add a new switch.

Joined: Feb 2003
Posts: 2,812
Hoopy frood
I think that for future-proofing with other ciphers, and with other applications using these ciphers, it may be necessary to support either binvar content or hexadecimal strings.

My argument would be to automatically detect and support a &binvar if that &binvar presently exists, else treat it as a literal string that happens to begin with an ampersand.
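
As a rough illustration of why a hex-string form would help, here is a Python sketch of a salt parser with a hypothetical `as_hex` flag standing in for a new switch. The name and the flag are made up; this is not a proposal for the exact mIRC syntax.

```python
def parse_salt(param: str, as_hex: bool = False) -> bytes:
    """Turn a salt parameter into raw bytes.

    as_hex=True  -> the text is pairs of hex digits, so 0x00 bytes are possible.
    as_hex=False -> each character 0-255 becomes one byte; 256+ raises an error.
    """
    if as_hex:
        return bytes.fromhex(param)
    return param.encode("latin-1")


# A salt with embedded 0x00 bytes can only be written in the hex form.
binary_salt = parse_salt("0000010000e90000", as_hex=True)
text_salt = parse_salt("5678")
```

A text parameter has no way to express the first salt, while the hex form covers every possible 8-byte value.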

