I'm assuming the major goal for $encode is to be compatible with OpenSSL, so 'e' and 'cl' should handle code points 128+ or too-long key parameters the same way OpenSSL does (or should). $encode violates the Blowfish cipher's design requirements in hopefully rare cases:

* Using 'e' with $len(key) or $len($utfencode(key)) greater than 56. There's no issue with shorter keys containing code points 128+.

* Using 'el' or 'cl' where the key string contains characters in the 128+ range, which are UTF-8 encoded into multiple bytes each. It rejects keys whose $len() is shorter than 56 even though their $utfencode() length is the correct 56 bytes, but accepts keys whose $len() is 56 even though they have more than 56 UTF-8 bytes.

If the decision is to maintain backwards compatibility for people who already use too-long keys, perhaps add a status window warning advising them to change this weak key.


16-round Blowfish uses an expanded password string of 72 bytes (18 sub-keys of 4 bytes each). The design allows the input string to have no more than 56 bytes, and fills the 72 bytes by replicating that string as many times as needed. This guarantees that at least 16 of the 72 expanded bytes are repeats, which makes it difficult to mount an attack where someone creates completely different 72-byte expanded passwords which have matching 'sub-keys' within the encryption cipher.
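The replication rule can be modeled outside mIRC. This is only a minimal Python sketch of the rule as described here, not mIRC's or OpenSSL's actual code; the name expand_key and its limit/size parameters are made up for illustration.

```python
def expand_key(key: bytes, limit: int = 56, size: int = 72) -> bytes:
    """Cycle the first `limit` key bytes until `size` bytes are filled."""
    if not key:
        raise ValueError("key must not be empty")
    pattern = key[:limit]  # a valid Blowfish key is 1-56 bytes
    return bytes(pattern[i % len(pattern)] for i in range(size))


key56 = bytes(range(1, 57))          # a maximum-length 56-byte key
expanded = expand_key(key56)

print(len(expanded))                 # 72
print(expanded[56:] == key56[:16])   # True: the last 16 bytes are repeats
```

Even with the longest legal key, the final 16 bytes are a copy of the first 16, which is the property the 56-byte limit is protecting.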

When 'e', 'el' or 'cl' send literal keys to Blowfish, they use $utfencode(key) as the replicated byte pattern, as they should. But they check $len(key) instead of $len($utfencode(key)) to determine whether the string has the correct number of bytes.

Assuming mIRC is otherwise compatible with OpenSSL, its current behavior is equivalent to "bset -t &key 1 key-parameter" without the 'a' switch. The only changes needed to fix the length issue are:

* using $bvar(&key,1-56) instead of $bvar(&key,1-72) as the byte pattern expanded to 72 bytes. This should apply only to 'e' or 'cl' since hashing of long non-l 'c' strings doesn't violate Blowfish's design because they use 'hash(key+other)' instead of 'key'.
* the 'l' switch needs to check $bvar(&key,0) instead of $len(key-parameter) when verifying the length is 56.
* decide for compatibility purposes how to handle currently permitted illegal passwords whose $len($utfencode(key)) lengths are greater than 56.
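To make the difference between the two length checks concrete, here is a small Python model. It assumes the reported behavior (character count is checked) and the proposed fix (UTF-8 byte count is checked). The function names are hypothetical, and the sample keys only use code point 233, which UTF-8 encodes into 2 bytes.

```python
def utf8_len(key: str) -> int:
    """Byte count after UTF-8 encoding, like $bvar(&key,0) after 'bset -t'."""
    return len(key.encode("utf-8"))


def l_check_reported(key: str) -> bool:
    """Behavior described in this post: compares the character count."""
    return len(key) == 56


def l_check_proposed(key: str) -> bool:
    """Proposed fix: compares the UTF-8 byte count."""
    return utf8_len(key) == 56


short_chars = "\u00e9" * 28   # 28 characters, 56 UTF-8 bytes
long_bytes = "\u00e9" * 56    # 56 characters, 112 UTF-8 bytes

for key in (short_chars, long_bytes):
    print(len(key), utf8_len(key), l_check_reported(key), l_check_proposed(key))
# 28 56 False True
# 56 112 True False
```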

The part below is just demonstrating how I identified that too-long UTF-8 encoded byte strings were being used.


Virtually all the test vectors for Blowfish use keys containing bytes in the 128-255 range, which UTF-8 expands into byte pairs, so those test vectors can't be replicated in mIRC without using a binary variable as the key, but that's a separate feature request. Here's an early Blowfish article which shows a test vector using only 7-bit text for both plaintext and key:


************** TEST VECTORS ***********************************
This is a test vector.
Plaintext is "BLOWFISH".
The key is "abcdefghijklmnopqrstuvwxyz".

#define PL 0x424c4f57l
#define PR 0x46495348l
#define CL 0x324ed0fel
#define CR 0xf413a203l
static char keey[]="abcdefghijklmnopqrstuvwxyz";
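For comparing against mIRC's byte output, the four 32-bit constants can be turned into byte lists. This Python sketch assumes the words are stored big-endian, which is consistent with PL/PR spelling out "BLOWFISH"; words_to_hex is a hypothetical helper name.

```python
import struct

PL, PR = 0x424C4F57, 0x46495348   # plaintext halves from the article
CL, CR = 0x324ED0FE, 0xF413A203   # ciphertext halves from the article


def words_to_hex(left: int, right: int) -> str:
    """Pack two 32-bit words big-endian and list the bytes as 2-digit hex."""
    return " ".join(f"{b:02X}" for b in struct.pack(">II", left, right))


print(struct.pack(">II", PL, PR))   # b'BLOWFISH'
print(words_to_hex(PL, PR))         # 42 4C 4F 57 46 49 53 48
print(words_to_hex(CL, CR))         # 32 4E D0 FE F4 13 A2 03
```

The last line is the 8-byte ciphertext to look for in the hex display.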

1) After loading the following code, using '/BFtest A 26' recreates the ciphertext defined in the article, where the plaintext is the 8-byte string BLOWFISH and the key is the 26-byte alphabet. The alias converts the ciphertext display of $bvar(&var,1-) from the 1-3 digit decimals to 2-digit hex, making it easier to match the vector listed in hex.

alias bvar2hex { var %h $1- | var %i $numtok(%h,32) | while (%i) { var %h $puttok(%h,$base($gettok(%h,%i,32),10,16,2),%i,32) | dec %i } | return %h }
alias BFtest {
  echo -a ---- /BFtest <a|b|c> [Length] | var %switches bm $+ $iif($1 isin c,cri,e) $+ $iif($1 isin bc,l) | bset -t &var 1 BLOWFISH123 | ; the 123 avoids length-8-binary bug
  var %IVchar 7 | var %ChopLength $iif(($2 isnum 1-) && (l !isincs %switches),$int($2),56)
  if ($1 isin c) { var %i 8 | while (%i) { bset &var %i $xor($bvar(&var,%i),%IVchar) | dec %i } }
  if ($1 isin a)  var %key $str(abcdefghijklmnopqrstuvwxyz,3)
  if ($1 isin bc) var %key $str(abc $+ $chr(233) $+ $chr(233),19) | var %key $left(%key,%ChopLength)
  echo switches: %switches keylen: $len(%key) utflen: $len($utfencode(%key)) key: %key plaintext hex/text: $bvar2hex($bvar(&var,1-8)) $bvar(&var,1-8).text
  if ($1 isin c) noop $encode(&var,%switches,%key,$str($chr(%IVchar),8))
  else           noop $encode(&var,%switches,%key)
  noop                $decode(&var,bm) | var %range 1-8 | if ($1 isin c) var %range 17-24
  echo $chr(3) $+ 0,4ciphertext range %range $+ : $bvar2hex($bvar(&var,%range)) / $bvar(&var,1-).text
  if ($1 isin abc) { bset -t &binkey 1 %key | echo pattern replicated to fill 72-byte expanded pass: $bvar(&binkey,1-72) $iif($bvar(&binkey,0) isnum 57-,$chr(22) bytes at the end that should not be used: $bvar(&binkey,57-72)) }
}

2) Using '/BFtest A 57' shows the ciphertext output where the key is the alphabet repeated until it reaches an invalid key length of 57. It expands to a 72-byte expanded password that can't be created by the correct method of expanding a length 1-56 string to 72 bytes. I was able to duplicate mIRC's encoded output only by altering a Blowfish utility to allow the max input key length to be greater than 56.
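The byte-level difference for the 57-byte key can be checked with a small Python model of the replication step. This is a sketch under the assumption that the key bytes are simply cycled to fill 72 bytes; expand_key and its limit parameter are hypothetical names, not part of mIRC or OpenSSL.

```python
import string


def expand_key(key: bytes, limit: int, size: int = 72) -> bytes:
    """Cycle the first `limit` bytes of `key` until `size` bytes are filled."""
    pattern = key[:limit]
    return bytes(pattern[i % len(pattern)] for i in range(size))


alphabet = string.ascii_lowercase.encode("ascii")
key57 = (alphabet * 3)[:57]               # same key as '/BFtest A 57'

too_long = expand_key(key57, limit=72)    # 57-byte pattern used as-is
chopped = expand_key(key57, limit=56)     # pattern chopped to 56 bytes first

print(too_long[52:60])       # b'abcdeabc'
print(chopped[52:60])        # b'abcdabcd'
print(too_long == chopped)   # False
```

The two expansions part ways at the 57th byte, so they produce different sub-keys and different ciphertext.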

This issue should only affect switch 'e' and 'cl', because 'c' without 'l' hashes the key string to fewer than 56 bytes, regardless of the length of the key parameter.

3) As I understand OpenSSL's stated behavior, it converts keyboard input to UTF-8 bytes, and it looks like mIRC does that with the 'key' parameter when using the 'e' switch, and 'cl' does the same thing. However, if this string contains any characters which UTF-8 encodes into multiple bytes each:

* 'e' uses as many as 72 UTF-8 encoded bytes as the pattern that's replicated to fill 72 bytes. It should use a pattern no longer than 56 UTF-8 bytes regardless of $len(key).
* 'el' and 'cl' incorrectly accept a key parameter whose 56-character string has a UTF-8 encoded length longer than 56 bytes.
* 'el' and 'cl' incorrectly reject 56-byte UTF-8 encoded strings because the non-encoded length is less than 56.

'/BFtest B' uses the 'el' switches to require a length of exactly 56. This key is a 5-character string repeated 11 times plus an ending 'a', for a key $len() of 5*11+1=56. However, the key contains characters which UTF-8 encodes into more than 1 byte each, so %key has a $utfencode() length of 7*11+1=78, and 'el' accepts it anyway. $encode's output matches a 72-byte expanded password made from the first 72 of those 78 bytes. Note that this particular key can't show by itself which expansion was used: the repeating pattern is 7 bytes and 56 is a multiple of 7, so replicating only the first 56 UTF-8 bytes produces the same 72 bytes. A key whose pattern doesn't wrap evenly at byte 56, like the 57-byte key in '/BFtest A 57', is needed to see the difference in the expansion.

4) '/BFtest C' has 'cl' replicate the 'el' behavior by using the same password, but first XORs the plaintext with the same ASCII value used to fill the fixed IV. In CBC feedback, the 1st plaintext block is XORed against the IV before it's encrypted, so the IV in 'cl' XORs the 1st plaintext block back to the same 'BLOWFISH' string seen by 'el'. This lets 'cl' encrypt the 1st 8-byte block, and only that block, the same way 'el' does.
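The XOR cancellation is easy to verify on its own. This Python sketch only models the XOR step around the cipher, not Blowfish itself; xor_bytes is a hypothetical helper, and the IV is 8 copies of byte value 7 as in the alias.

```python
def xor_bytes(a: bytes, b: bytes) -> bytes:
    """XOR two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


iv = bytes([7]) * 8                       # %IVchar repeated 8 times
plaintext = b"BLOWFISH"

pre_xored = xor_bytes(plaintext, iv)      # what the alias stores before 'cl'
cipher_input = xor_bytes(pre_xored, iv)   # CBC XORs block 1 with the IV again

print(pre_xored)                   # b'EKHPANTO'
print(cipher_input == plaintext)   # True
```

Since the block cipher sees 'BLOWFISH' in both modes, the 1st ciphertext block should match whenever the expanded key matches.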

Changing %IVchar from 7 to a number in the 128-255 range doesn't alter the encrypted ciphertext, so the 4th parameter isn't UTF-8 encoded for the 'i' switch and probably not for 's' either. I'm pretty sure this is compatible behavior with OpenSSL, as UTF-8 encoding the 's' and 'i' values would greatly reduce the possible values which could fit within 8 bytes, and could cause some to be unexpectedly chopped.

If altering this snippet to test different keys, remember that a Blowfish key and the same key repeated are equivalent as long as the repeated form still fits in the allowed 56 bytes, because both expand to the same 72-byte expanded key. E.g. 'abcd' and 'abcdabcd' are the same key, as are '/BFtest a 26' and '/BFtest a 52'. Because of the current incorrect 'e' expansion of the key, '/BFtest a 72' is possible and is identical to 26 and 52, instead of being chopped to 56 and making a different expanded key.
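These equivalences can be checked before picking a test key. This is a hedged Python model that assumes the expansion is a plain cyclic fill of 72 bytes; expand_key and its limit parameter are hypothetical. limit=72 stands in for the behavior reported in this post, and limit=56 for the behavior the design calls for.

```python
import string


def expand_key(key: bytes, limit: int, size: int = 72) -> bytes:
    """Cycle the first `limit` bytes of `key` until `size` bytes are filled."""
    pattern = key[:limit]
    return bytes(pattern[i % len(pattern)] for i in range(size))


alphabet = string.ascii_lowercase.encode("ascii")
a26, a52, a72 = ((alphabet * 3)[:n] for n in (26, 52, 72))

print(expand_key(b"abcd", 56) == expand_key(b"abcdabcd", 56))   # True
print(expand_key(a26, 56) == expand_key(a52, 56))               # True
print(expand_key(a72, 72) == expand_key(a26, 56))               # True
print(expand_key(a72, 56) == expand_key(a26, 56))               # False
```

The third line is why '/BFtest a 72' matches 26 and 52 today, and the fourth shows it would become a different key once the pattern is chopped to 56 bytes.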