#263025 15/05/18 07:37 AM
Joined: Jan 2004
Posts: 2,127
maroon Offline OP
Hoopy frood
This continues the other $hotp bug report. $hotp decodes key strings in a way that doesn't appear to be documented. The /help page for $hotp describes 4 methods of presenting the key:

Hex strings of length 40/64/128
Base32 strings of length 16/24/32
Google Authenticator Format
plain text

It specifically says that it decodes Base32 strings of the 3 lengths 16/24/32, however I've found that as long as the key is a string that's *any* multiple of 8 characters greater than length 8, and contains only the 32 case-insensitive characters of the Base32 alphabet excluding the '=' padding character, $hotp is using the Base32 decoded string instead of the case-sensitive literal string. The exception is strings of length 40/64/128 containing only hexadecimal characters.

This means that someone using a key that contains the properties of a Base32 encoded string loses password strength because their password is being handled in a case-insensitive manner.

Code:
//var %i 2 | while (%i isnum 1-25) { var %a 5 * %i | var %b $regsubex($str(x,%a),/x/g,$r(!,~)) | echo -a $hotp(%b,123) $hotp($lower($encode(%b,a)),123) $hotp($upper($encode(%b,a)),123) - %a %b | inc %i }



Can you give an example of how Google Authenticator format differs from the other 3 described methods of presenting the key? Other than hex strings of lengths 40/64/128, or strings of length 16+ which could be mistaken for a Base32 encoded string that doesn't contain the Base32 padding character, all other key strings I've tested appear to be handled as the literal text string.

Joined: Dec 2002
Posts: 5,411
Hoopy frood
Thanks for your bug report. This sounds like the same issue as your previous post. Since the identifier does not provide a parameter that allows you to specify the actual format of the key, mIRC guesses at what the parameter is based on the number of characters and whether they are in a hex/base32 format. So if you provide a parameter that overlaps in some way and is ambiguous, what you describe will happen.

Joined: Jan 2004
Posts: 2,127
maroon Offline OP
Hoopy frood
The /help says it decodes only 3 specific lengths of hex strings, and it does that. The /help says it decodes only 3 specific lengths of base32 strings, but instead it's looking at all other lengths greater than 8 which are multiples of 8 without containing the '=' padding character, to see if they match the pattern of base32 encoding.

Or is this decoding of all these additional base32 lengths what /help means by 'Google Authenticator' format? The /help mentions that as a 4th way of handling keys, but I couldn't find a 4th way beyond $utfencode($remove(decoding hex,0x00)), or decoding base32, or literal text.

Joined: Dec 2002
Posts: 5,411
Hoopy frood
Quote:
The /help says it decodes only 3 specific lengths of base32 strings, but instead it's looking at all other lengths greater than 8 which are multiples of 8 without containing the '=' padding character, to see if they match the pattern of base32 encoding.

That is correct. The comments in my code state that it should check for >=16 and multiples of 8 instead of just 16/24/32. I cannot remember why as I implemented this feature three years ago and researched it at that time. You will need to research this, and the Google Authenticator format, yourself I'm afraid.

Joined: Jan 2004
Posts: 2,127
maroon Offline OP
Hoopy frood
In summary results of my 'research', the /help reference to "base32 format of 16/24/32 chars" and "lower case with spaces" both appear to be references to Google Authenticator format, but $hotp implements them both incorrectly. The length should be strings of 16/26/32 Base32 characters, either as a continuous string or as separated by spaces into groups of 4 for easier readability. $hotp/$totp are also measuring those lengths before removing spaces but instead should measure the length after removing spaces.

The specified lengths of hex strings appear to be designed to support applications that use binary hash digests of lengths 20/32/64 for sha1/sha256/sha512. Any such application would have no reason to strip 0x00's and UTF-8 encode the remaining bytes as if they're text.

All issues affect $hotp and $totp equally, as $totp uses a count parameter derived from $ctime instead of being a purely sequential value.

- -

The closest I can find to identifying $hotp identifier's key parameter's parsing is a quote I find on the Wikipedia page for "Google Authenticator":

Quote:

"The service provider generates an 80-bit secret key for each user (whereas RFC 4226 §4 requires 128 bits and recommends 160 bits).[39] This is provided as a 16, 26 or 32 character base32 string or as a QR code."


The /help says "base32 format of 16/24/32 chars". The 24 appears to be an attempt to support Google Auth, but the wrong number is documented and that same wrong number is implemented in the actual string handling. At first I took the 26 to be a typo, but the 3 bit-lengths referenced in the quote are 80, 128 and 160 bits. When excluding the '=' padding, 16, 26 and 32 are the Base32 lengths which encode binary key lengths of 80, 128 and 160 bits:

Code:
//var %i 1 , %a 80 128 160 | while (%i isnum 1-3) { echo -a $len($remove($encode($str(X,$calc($gettok(%a,%i,32) /8)),a),=)) | inc %i }



Instead, $hotp is only decoding base32 strings whose length is an exact multiple of 8 greater than length 8, without allowing them to have any '=' padding. This means $hotp is not supporting 128-bit keys encoded as 26 characters. It appears Google Auth doesn't pad their 26-character strings with ='s.

The closest I could find where Google Authenticator is associated with base32 encoding to every multiple of 8 is the source code at: https://github.com/google/google-authent...authenticator.c

... where the line:

Quote:

#define SECRET_BITS 80 // Must be divisible by eight


... has the key length hard-coded as 80 bits, but the comment implies it can be edited to be any value that's a multiple of 8. But this is a multiple of 8 bits for the binary key, not the byte length of the base32 encoding.

Each group of 8 base32 characters can encode 40 bits, so having the base32 strings be multiples of 8 without padding assumes the binary key is always going to be a multiple of 40, which so far is true only for 80 and 160 bit keys.
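As a cross-check outside mIRC, here is a small Python sketch of the length arithmetic. The helper name unpadded_base32_length is made up for illustration, and the keys are all-zero placeholders rather than real secrets.

```python
import base64
import math

def unpadded_base32_length(bits: int) -> int:
    """Base32 characters needed for a key of `bits` bits, ignoring '=' padding."""
    return math.ceil(bits / 5)

for bits in (80, 128, 160):
    key = bytes(bits // 8)                      # all-zero placeholder key
    encoded = base64.b32encode(key).decode()    # standard encoder adds '=' padding
    stripped = encoded.rstrip("=")
    # bit length, unpadded base32 length, and whether that length is a multiple of 8
    print(bits, len(stripped), len(stripped) % 8 == 0)
# 80 16 True
# 128 26 False
# 160 32 True
```

Only the 128-bit key lands on a length that is not a multiple of 8, which is the case a multiples-of-8 check cannot accept.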

The references to spaces and lower-case being part of Google Authenticator are at places like:

https://soeithelp.stanford.edu/hc/en-us/...d-or-iPod-Touch
https://garbagecollected.org/2014/09/14/how-google-authenticator-works/

The 1st link describes the key being a 26-character base32 string which can be typed as upper or lower case, and with-or-without spaces. The 2nd link shows a 160-bit key being presented as 8 groups of XXXX place-holders separated by spaces. I can't find a reference to how Google Auth handles the fact that 26 isn't a multiple of 4, but I'm guessing that there's either a couple groups of 5 or a final group of 2 characters. The keys are presented in small groups of digits to be user friendly.

I can't find reference to how rigid Google Auth is when someone enters their code using spaces, but I suspect that it allows as many spaces as the user wants, then simply removes the spaces to check if the remaining string is base32 of the appropriate length. But that can be too 'grabby' in this context where the identifier is trying to discern between literal plaintext and base32 encoded strings.

--

In addition to not supporting the 128-bit keys encoded as 26 base32 digits, $hotp is incorrectly supporting the 'lower-case-and-spaces' method, because it is measuring the 16/24/32 (instead of 16/26/32) length while they include spaces instead of verifying those lengths after the spaces are deleted. The base32 encoding of 12345678901 is the 18 character string GEZDGNBVGY3TQOJQGE. $hotp uses this string as a literal text key because the length isn't a multiple of 8. But when padded internally with 6 spaces to make the length be 24, $hotp then deletes the spaces, then it base32 decodes the remaining 18-char string into the underlying binary contents which in this example happen to also be bytes in the printable ASCII range. In this example, 3 different strings return the same password:

Code:
//var %a G E Z D G N BVGY3TQOJQGE         | echo -a $len(%a) $hotp(%a,1,sha1,9) $hotp($lower(%a),1,sha1,9) $hotp(12345678901,1,sha1,9)
//var %a G E Z D G N B VG Y3 T Q O JQ G E | echo -a $len(%a) $hotp(%a,1,sha1,9) $hotp($lower(%a),1,sha1,9) $hotp(12345678901,1,sha1,9)



As far as $hotp's checking is concerned, it doesn't matter where the spaces are inserted or how many non-space characters are in the string, as long as the spaces+alphanumeric string has a total length of 16/24/32. In the above example, the 2nd command returns identical passwords because inserting 8 additional spaces brought the space-padded length to 32, causing the space-padded string to again be handled as base32.
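The string arithmetic in this example can be reproduced in Python. This is only a sketch of the decoding step, not of $hotp itself, and b32decode_unpadded is a hypothetical helper that restores the '=' padding Python's decoder insists on.

```python
import base64

def b32decode_unpadded(text: str) -> bytes:
    """Decode base32 after dropping spaces, upper-casing, and restoring '=' padding."""
    compact = text.replace(" ", "").upper()
    compact += "=" * (-len(compact) % 8)
    return base64.b32decode(compact)

spaced = "G E Z D G N BVGY3TQOJQGE"
print(len(spaced))                       # 24 (length measured with spaces)
print(len(spaced.replace(" ", "")))      # 18 (length after removing spaces)
print(b32decode_unpadded(spaced))        # b'12345678901'
```

Measuring the length before removing the spaces is what lets an 18-character base32 payload pass a 24-character check.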

Even if $hotp is fixed to evaluate the correct lengths after the spaces are removed, I'm not sure it's desirable to support space padding of alphanumeric strings in non 'official' groups of characters, or even supporting mixed case. This example shows the password is the same in upper/lower/mixed case due to the actual password being the base32 decoding of the 21 non-spaces into the underlying non-utf8-encoded binary string inside:

Code:
//var %a CuRiOsItY KiLlEd ThE CaT | echo -a $hotp($upper(%a),1) $hotp($lower(%a),1)  $hotp(%a,1)


Adding 8 additional non-consecutive spaces results in the key length increasing from 24 to 32, causing the same 21 non-space characters to be base32-decoded as the same key.

--

I haven't been able to track down any Google Auth references related to hex lengths of 40/64/128 chars needing to be decoded before being used as the key. Every reference of Google Auth keys being encoded has them being encoded as base32 not hex. But I can't imagine $hotp's current handling of hex strings matching any test vectors containing hex encoding of ASCII 00 or 128-255.

The 40/64/128 lengths seem obviously intended to be the hex-text display of 160/256/512 -bit key lengths where an application is wanting to decode the hex-text digests for sha1/sha256/sha512 into binary strings of length 20/32/64. This seems like the kind of thing OpenSSL would do, but I've been finding references to it encoding things as Mime or Base32 and not so much as Hex. It would not be desirable for an application to assume that the underlying contents needs to be UTF8-encoded after the 0x00's are stripped. A pair of hash digests which are identical except for location of 0x00 bytes would generate matching keys if all 0x00's were stripped.

Even when the decoded hex contents appears to already be UTF-8 encoded, the string is being encoded again, as shown by these matching passwords, where the hex encoded key already contains the UTF-8 encoding of $chr(10004):

Code:
//echo -a $hotp($utfencode($chr(10004)),1,sha1,9) $hotp($str(00,17) $+ E29C94,1,sha1,9)


--

This shows what I was trying to say in an earlier post, that Base32 and Hex16 encoded strings are not being handled the same way. The underlying decoded contents of Base32 strings are being used as their un-modified binary contents. Even though the above hex key is already a UTF8-encoded string, it is re-encoded again, so the "E2 9C 94" hex bytes are each encoded so the binary key for both usages becomes "C3 A2 C2 9C C2 94".

When $hotp recognizes a key as being a Base32 string, the key used is the binary contents that's 5/8ths as long. It does not have 0x00's stripped from the binary key nor does it have the remaining bytes UTF8-encoded. On the other hand, the underlying decoded contents of Hex16 strings are being UTF8-encoded after having 0x00's stripped.
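The hex handling described above can be modelled in Python. This is a model of the behaviour as I observed it, not mIRC's actual code, and hex_key_as_reported is a made-up name. It assumes each decoded byte is treated as a Latin-1 code point before being UTF-8 encoded.

```python
def hex_key_as_reported(hex_text: str) -> bytes:
    """Model: decode hex, drop 0x00 bytes, then UTF-8 encode each remaining byte as text."""
    raw = bytes.fromhex(hex_text)
    no_nulls = raw.replace(b"\x00", b"")
    return no_nulls.decode("latin-1").encode("utf-8")

key = "00" * 17 + "E29C94"                            # 40 hex characters
print(bytes.fromhex(key)[-3:].hex(" ").upper())       # E2 9C 94
print(hex_key_as_reported(key).hex(" ").upper())      # C3 A2 C2 9C C2 94
```

A base32 key, by contrast, would keep the three bytes E2 9C 94 unchanged.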

In this example, the base32-decoded binary string is not altered, causing it to match the different literal text key and the hex-encoded key, where the hex digits are handled as if they're the encoding of non-UTF8-encoded text instead of encoding a binary hash digest. The latter 2 identical keys are obtained by UTF-8 encoding the 2nd and 3rd different strings into having the same password output for all 3 strings:

Code:
//bset &var 1 $str(195 169 $chr(32),10) | noop $encode(&var,ba) | var %a $bvar(&var,1-).text | echo -a %a $hotp(%a,1) / $hotp($str(é,10),1) / $hotp($str(00,10) $+ $str(e9,10),1)



--

To fix these issues, it seems like the hierarchy of rules for handling the key parameter needs to change. Even though that will break backwards compatibility, it would restore support for 128-bit keys encoded by Google Authenticator into 26-character base-32 strings. It would also restore compatibility with applications that expect hex digests of length 40/64/128 to be binary keys of 20/32/64 bytes. It should also avoid false-matches of language passphrases containing spaces and no punctuation simply because their space-padded lengths happened to be 16/24/32.

1st rule:
Old: If key is a length 40/64/128 case-insensitive hex string, it is decoded to become a text string that is then UTF-8 encoded, and any 0x00's stripped. If the string is entirely 0x00's, the key is $null.
New: These hex strings should instead be decoded to binary keys of length 20/32/64 the same way base32 strings are being decoded to their binary contents.

2nd rule:
Old: If key length is 16 or greater and a multiple of 8 (except for lengths 40/64/128 containing only 0-9a-f), and if it's a valid case-insensitive Base32 encoded string without spaces or '=' padding, the key is the binary decoded contents whose length is 5/8ths the length of the Base32 string, with no 0x00's stripped and no UTF-8 encoding as if the contents are text.
New: no change

3rd rule:
Old: If the key length is 16/24/32 and contains only spaces or case-insensitive Base-32 characters, the spaces are stripped and the remaining Base-32 characters of arbitrary length are decoded into a binary key.
New: The target key lengths should instead be 16/26/32, and should be compared against the string length only after the spaces are removed. The spaces should be used only to group the characters into the same pattern presented by Google Authenticator, such as groups of 4 non-spaces, and valid strings should probably not include mixed-case letters.

4th rule:
Old: Any remaining strings not matching the 1st 3 patterns are considered literal text keys, and the input is assumed to already be UTF-8 encoded where necessary.
New: no change

Joined: Dec 2002
Posts: 5,411
Hoopy frood
Quote:
1st rule:
Old: If key is a length 40/64/128 case-insensitive hex string, it is decoded to become a text string that is then UTF-8 encoded, and any 0x00's stripped. If the string is entirely 0x00's, the key is $null.
New: These hex strings should instead be decoded to binary keys of length 20/32/64 the same way base32 strings are being decoded to their binary contents.

When I originally researched this topic, I based the design of $hotp()/$totp() on many of the real-world C/C++ examples, discussions, and examples I found. So, for example, UTF-8 encoding all of the formats was something that was common to the implementations I saw, so that is what mIRC does. UTF-8 encoding obviously breaks keys that include null bytes. So the question is, are null bytes actually allowed?

Quote:
3rd rule:
Old: If the key length is 16/24/32 and contains only spaces or case-insensitive Base-32 characters, the spaces are stripped and the remaining Base-32 characters of arbitrary length are decoded into a binary key.
New: The target key lengths should instead be 16/26/32, and should be compared against the string length only after the spaces are removed. The spaces should be used only to group the characters into the same pattern presented by Google Authenticator, such as groups of 4 non-spaces, and valid strings should probably not include mixed-case letters.

Puzzling. mIRC's implementation uses 16/24/32 because that is what other implementations were using.

As it took a lot of time to research, implement, and validate these identifiers originally, I will need to go through this process again. I have added this to my to-do list.

Joined: Jan 2004
Posts: 2,127
maroon Offline OP
Hoopy frood
A related question, covered in more detail in section (2) below: is this UTF-8 encoding of hash strings also being done by $encode's Blowfish encryption when not using the 'l' switch? For $encode to be compatible with OpenSSL, I would expect $encode(33333333,ae,gryffczmj) to generate output matching:

openssl.exe enc -bf-ecb -in 33333333.txt -out test.out -nosalt -p -pass pass:gryffczmj

but I can't get them to both make the same output, so if I'm not doing something wrong, I thought it might be caused by $encode utf-8 encoding hash outputs?

Quote:

UTF-8 encoding all of the formats was something that was common to the implementations I saw, so that is what mIRC does. UTF-8 encoding obviously breaks keys that include null bytes. So the question is, are null bytes actually allowed?


If by 'key' you mean UTF-8 encoding the command-line parameter being used as input to $hmac or $totp or $hotp, or to a hash function to generate the hash being used to derive the actual encryption key, then yes I can see that being done, as it's needed to ensure input typed by different users would generate matching keys.

But once the binary hash is created, I can't see it having 0x00's stripped and the remaining bytes being individually UTF-8 encoded. UTF-8 encoding of a hash digest isn't needed to ensure compatibility between people using different languages. But also, if hash digest output were translated to utf-8 text, it would greatly weaken the encryption in most cases.

For example, AES-128 uses a 128-bit key, so if AES-128 used the first 128 bits of a hash digest as the key, on average 50% of the bytes would be in the 128-255 range. If the hash digest were $utfencoded, each of those bytes would be replaced by 2 bytes. A 0x00 byte would appear on average in about one of every 16 128-bit hash strings. The combined effect is that UTF-8 encoding a 16-byte digest with a 50/50 mix of 128-255 and 0-127 bytes expands it into a text string averaging around 24 bytes. AES-128 would chop this UTF-8 string at 16 bytes, so the first 128 bits of that 23-25 byte string would carry approximately 2/3rds of the original 128 bits: about 85 bits. If the first 8 bytes of the hash digest were all in the range 0x80 through 0xff, UTF-8 encoding the hash would leave the 128-bit key carrying only 64 bits of the digest.
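The averages in this estimate can be checked with exact fractions. This Python sketch assumes the digest bytes are uniformly random and that the mangling is "drop 0x00, UTF-8 encode every other byte as a Latin-1 character"; both are assumptions for illustration.

```python
from fractions import Fraction

DIGEST_BYTES = 16

# Output length per input byte under the assumed mangling:
# 0x00 -> 0 bytes, 0x01-0x7F -> 1 byte, 0x80-0xFF -> 2 bytes.
expected_per_byte = Fraction(0 * 1 + 1 * 127 + 2 * 128, 256)
expected_length = DIGEST_BYTES * expected_per_byte

# Chance that a 16-byte digest contains at least one 0x00 byte.
p_has_null = 1 - Fraction(255, 256) ** DIGEST_BYTES

print(float(expected_length))          # 23.9375
print(round(float(p_has_null), 4))     # roughly 0.06, i.e. about 1 digest in 16
```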

Also, given the way Blowfish handles the input key, a hash output of all 0x00's handled this way would crash the program. Blowfish accepts a variable input of 1-56 characters, and uses a key schedule which expands the input to 72 bytes. If the input were length 0 because it was a hash digest consisting entirely of 0x00's, it's impossible to expand 0 bytes to 72 bytes.

--

Additional evidence I've found of using hash digests only as non-utf-8 encoded binary strings:

(1) HMAC itself. I can't find test vectors involving literal keys containing non-text, but I imagine that applications accepting text input to HMAC would be UTF-8 encoding that text. If using the default sha1, that hash has a 512-bit block size, so if the literal input key string is 512 bits (64 bytes) or shorter, then that literal string is used as the 'secret key'. The input is padded with enough 0x00 bytes to make it 512 bits. However if the input is longer than 512 bits, the input string is hashed via sha1, and the secret key is instead replaced with the 20 binary bytes of the sha1 digest, then padded with 44 0x00 bytes to fill the entire 512-bit block. The sha1 hash is not UTF-8 encoded here, nor is it UTF-8 encoded or stripped of 0x00's when the derivative inner/outer keys are created by XOR'ing the padded key with 0x36 and 0x5c. When the inner hash digest is appended to the outer key for the outer hash digest, it's not UTF-8 encoded either. When that HMAC hash result is used by $totp or $hotp, it's not being UTF-8 encoded while calculating the 6-digit numeric password.
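For reference, the HMAC-SHA1 steps can be written out by hand in Python and compared against the standard library. The function name hmac_sha1_manual and the test key and message are arbitrary choices for this sketch.

```python
import hashlib
import hmac

BLOCK_SIZE = 64  # SHA-1 block size in bytes (512 bits)

def hmac_sha1_manual(key: bytes, message: bytes) -> bytes:
    if len(key) > BLOCK_SIZE:
        key = hashlib.sha1(key).digest()        # 20 raw digest bytes, no text encoding
    key = key.ljust(BLOCK_SIZE, b"\x00")        # zero-pad up to the 64-byte block
    inner_key = bytes(b ^ 0x36 for b in key)
    outer_key = bytes(b ^ 0x5C for b in key)
    inner = hashlib.sha1(inner_key + message).digest()
    return hashlib.sha1(outer_key + inner).digest()

long_key = b"k" * 100                           # longer than one block, so it gets hashed
msg = b"message"
print(hmac_sha1_manual(long_key, msg) == hmac.new(long_key, msg, hashlib.sha1).digest())  # True
```

At no point are the digest bytes treated as text, which is the point being made about binary keys.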

--

(2) I haven't yet matched mIRC's Blowfish against another application except matching its ECB mode when using $encode's 'l' switch for literal keys. I haven't been able to match $encode vs OpenSSL for any test vectors containing 0x00's or 0x80-0xff bytes because mIRC doesn't allow &binvar as a literal key. However I have identified OpenSSL using a hash digest as a key, where it uses the binary bytes of the hash without discarding 0x00's or utf-8 encoding the remainder as if it's text. That's the reason I had made a request about using binary keys for $encode's Blowfish. It was to allow using actual binary hashes as the key, and not having them utf-8 encoded as if text.
https://forums.mirc.com/ubbthreads.php/topics/261893/New_$encode_switches

To re-create the match between OpenSSL and my assembler program, both using 0x80-0xff bytes without UTF-8 encoding them:

Code:
//write -n test.dat 33333333

Then at the command prompt:

Code:
openssl.exe enc -bf-ecb -in test.dat -out test.out -nosalt -p -pass pass:gryffczmj


This gives the key display: 10A6FD97002DF6CC087802129F8CD064

The 5th byte is a 0x00 and 3 of the first 4 bytes are ASCII 128-255. This is the first 128 bits: //echo -a $upper($left($sha256(gryffczmj),32))

The output file's first 8 bytes are: 0xAD 0x1E 0xD9 0x2E 0xC7 0x43 0xEA 0x40

I match this output in my assembler program by using the 16 bytes "10 A6 FD 97 00 2D F6 CC 08 78 02 12 9F 8C D0 64" as the literal key, without removing the 0x00 nor utf-8 encoding the remainder - showing that's how OpenSSL handles hash output.
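To show what stripping 0x00's and UTF-8 encoding would do to this particular key, here is a Python sketch. strip_nulls_and_utf8 is a hypothetical model of the handling in question, not anything taken from mIRC or OpenSSL. The sha256 line only prints a value to compare against the key display above, under the assumption that this OpenSSL build derives the key from SHA-256 of the password.

```python
import hashlib

def strip_nulls_and_utf8(raw: bytes) -> bytes:
    """Model: drop 0x00 bytes, then UTF-8 encode each remaining byte as a Latin-1 character."""
    return raw.replace(b"\x00", b"").decode("latin-1").encode("utf-8")

shown_key = bytes.fromhex("10A6FD97002DF6CC087802129F8CD064")   # key display from OpenSSL

# Value to compare by eye against the key display (assumes SHA-256 key derivation).
print(hashlib.sha256(b"gryffczmj").digest()[:16].hex().upper())

mangled = strip_nulls_and_utf8(shown_key)
print(len(shown_key), len(mangled))       # 16 23
print(mangled[:16].hex(" ").upper())      # the 16 bytes a 128-bit cipher would end up using
```

The 16-byte key grows to 23 bytes, so a cipher taking the first 128 bits would use a different key from the one OpenSSL displays.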

(2b)

When not using the 'l' switch, are $encode's 'e' and 'c' switches UTF-8 encoding the hash output and stripping 0x00's? Because I can't figure out how to get compatible output to the above from $encode. I can't get compatible output when I use:

Code:
//bset -t &v 1 $str(3,16) | noop $encode(&v,bae,gryffczmj) | noop $decode(&v,ba) | echo -a $regsubex($bvar(&v,1-16),/(\d+)/g,$base(\t,10,16,2))

--

(3) I finally found Google Authenticator associated with a hex16 encoding of its key, and the hex string is not UTF-8 encoded.

https://lists.open.com.au/pipermail/radiator/2011-June/017420.html

It associates the hex and base32 strings as non-utf-8 equivalents, even when they contain ASCII 128-255 or 0x00's. It shows these as equivalents:

Code:
3132333435363738393031323334353637383930   GEZD GNBV GY3T QOJQ GEZD GNBV GY3T QOJQ
d8f828609e0f4056f852e4c9d75605099f483e20   3D4C QYE6 B5AF N6CS 4TE5 OVQF BGPU QPRA
b906daef6d002ec6cc89106df25f8268ce28f95e   XEDN V33N AAXM NTEJ CBW7 EX4C NDHC R6K6
0000000000000000000000000000000000000000   AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA


These were also 4 examples of the 160-bit Google Auth key encoded as Base32 containing spaces to make it easier for the user to type the key. This translates the above Base32 strings into the above hex equivalent, with spaces for easier reading:

Code:
//bset -t &v 1 $remove(XEDN V33N AAXM NTEJ CBW7 EX4C NDHC R6K6,$chr(32)) | noop $decode(&v,ba) | echo -a $regsubex($bvar(&v,1-),/(\d+)/g,$base(\t,10,16,2))


Each of these 4 Base32 strings decodes to the same binary bytes as its Base16 counterpart. When $hotp and $totp are given one of these 32-character Base32 strings without spaces, they correctly decode it into the 160-bit binary key. When divided by spaces as above, it's incorrectly handled as a 39-character literal text key because the length including spaces isn't a multiple of 8. Adding 1 additional space between any of the groups of 4 would make the length 40, and it would decode correctly because the length-with-spaces is a multiple of 8. When given the equivalent length-40 hex string, it's decoded to the same 20 binary bytes as the Base32 key, but it's then processed further: the 0x00's are removed, and the remaining bytes are individually UTF-8 encoded and used as if they were a text string.
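The equivalence of the 4 pairs can be confirmed in Python, along with the one-time password that results when both forms are used as the same raw binary key. The hotp function below is a minimal sketch of the RFC 4226 calculation with SHA-1, not mIRC's implementation, and the 755224 value is the RFC's published test vector for the key 12345678901234567890 at counter 0.

```python
import base64
import hashlib
import hmac
import struct

def hotp(key: bytes, counter: int, digits: int = 6) -> str:
    """RFC 4226 HOTP over HMAC-SHA1 with dynamic truncation."""
    mac = hmac.new(key, struct.pack(">Q", counter), hashlib.sha1).digest()
    offset = mac[-1] & 0x0F
    code = struct.unpack(">I", mac[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(code % 10 ** digits).zfill(digits)

PAIRS = [
    ("3132333435363738393031323334353637383930", "GEZD GNBV GY3T QOJQ GEZD GNBV GY3T QOJQ"),
    ("d8f828609e0f4056f852e4c9d75605099f483e20", "3D4C QYE6 B5AF N6CS 4TE5 OVQF BGPU QPRA"),
    ("b906daef6d002ec6cc89106df25f8268ce28f95e", "XEDN V33N AAXM NTEJ CBW7 EX4C NDHC R6K6"),
    ("0000000000000000000000000000000000000000", "AAAA AAAA AAAA AAAA AAAA AAAA AAAA AAAA"),
]

for hex_text, b32_text in PAIRS:
    from_hex = bytes.fromhex(hex_text)
    from_b32 = base64.b32decode(b32_text.replace(" ", ""))
    # True when both encodings give the same 20 bytes, then the code for counter 0
    print(from_hex == from_b32, hotp(from_hex, 0))
# first line printed: True 755224
```

If the hex form is instead stripped of 0x00's and UTF-8 encoded, the last three rows can no longer produce the same password as their base32 form.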

Joined: Jan 2004
Posts: 2,127
maroon Offline OP
Hoopy frood
Update: Changed how key parameter is parsed between the 3 rules, and more details.

https://soeithelp.stanford.edu/hc/en-us/...d-or-iPod-Touch

This refers to additional evidence that Google Authenticator format is base32 length 16/26/32 not 16/24/32. It describes the key being 26 characters long. All the examples I've seen describing the GA key being separated by spaces use the example for lengths 16 or 32 where spaces are used to split the key ONLY into groups of 4 digits, but I'm guessing that the format for the rarely used 26 characters of a 128 bit key would do the same thing, except having a final group of the leftover 2 characters. i.e.:

Code:
//var %key $regsubex($str(x,26),/(.)/g,$iif( $calc( \n % 4) == 0, \t $+ $chr(32), \t )) | echo -a $len(%key) key %key
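The same grouping in Python, for anyone checking this outside mIRC. group_in_fours is a made-up helper, and the final group of 2 is my guess at the layout, not a documented Google Authenticator format.

```python
def group_in_fours(key: str) -> str:
    """Split a key into space-separated groups of 4, leaving any remainder as the last group."""
    return " ".join(key[i:i + 4] for i in range(0, len(key), 4))

grouped = group_in_fours("x" * 26)
print(len(grouped), grouped)   # 32 xxxx xxxx xxxx xxxx xxxx xxxx xx
```

Note that a 26-character key grouped this way is 32 characters long including its 6 spaces, so a length check made before removing spaces would see it as length 32.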



16/26/32 are the lengths of a base32 string that would be encoding a binary key of length 80(half-sha1)/128(md5)/160(full-sha1). The hex lengths 40/64/128 are the lengths for the hash digests of sha1/sha256/sha512. The binary keys encoded using hex should not have 0x00's stripped or the remaining bytes UTF-8 encoded, just as the keys encoded by base32 don't.

Further investigation of $hotp and $totp shows me that they accept spaces in hex strings as well as base32, as long as the length of the input string including the spaces is one of the lengths 40/64/128. I have yet to find any other program where spaces are included in the length prior to checking whether the string is of the correct length but then thrown away before decoding the key. I still can't find other programs using spaces in any manner other than to chop the 6 output numbers into 2 groups of 3 numbers, or chop a very long input string into groups of 4 base32 characters, where the space is never preceded by anything except exactly 4 base32 characters. I can't find examples of spaces used with hex, but if it does exist I can't see it used in any way other than creating the same groupings of 4 digits for ease of data entry with fewer errors.

https://npm.taobao.org/package/authenticator is another url I've found using spaces mixed with base32, and again the spaces are preceded by exactly 4 base32 digits. The webpage uses the example key "acqo ua72 d3yf a4e5 uorx ztkh j2xl 3wiz" which is the base32 encoding of a 160-bit binary key. $hotp and $totp would use this as a literal case-sensitive text key instead of decoding to the 160-bit value, because the string including the 7 spaces has length 32+7=39, and isn't a multiple of 8.

-

I've been able to duplicate how $hotp calculates the output number for all inputs I've tested, except when the key parameter is length 40/64/128 and contains an odd number of hex digits and an odd number of spaces. What is the actual decoded binary key for a string of 39 0's and 1 space?

Because $hmac appends 0x00's to strings shorter than 512 bits, all $hotp keys consisting entirely of 0x00's yield the same output string.

Code:
//var %a 0000000000000000000000000000000000000000 | echo -a len: $len(%a) output: $hotp(%a,9,sha1,9) key: %a



The above key has length 40, so it decodes the key as if it's a hex string of all 0x00's, and returns the same output as 64 0's or 128 0's. If an EVEN number of 0's are changed to spaces while retaining the 40 length, the output remains the same because 20 0x00's and 19-or-fewer 0x00's are both padded to identical HMAC strings. However if an ODD number of 0's are changed to spaces, the output changes to something different, and stays at that same different value regardless of which ODD number of spaces is used or where those spaces are located within the key string. This tells me that $hotp is not handling the un-paired digit by prepending or appending a '0' to the string, or else 1 space with 39 0's would output the same as 40 0's. The output for 39 zeroes + 1 space is also not the same as the output from decoding a base32 string of 39 0's.
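The claim that all-zero keys of different lengths are indistinguishable to HMAC is easy to confirm in Python. This sketch uses HMAC-SHA1 with counter 9 as in the example above, and the list of key lengths is arbitrary.

```python
import hashlib
import hmac
import struct

counter = struct.pack(">Q", 9)          # 8-byte big-endian counter, as HOTP uses

# HMAC-SHA1 digests for all-zero keys of several lengths up to the 64-byte block size.
digests = {
    hmac.new(bytes(n), counter, hashlib.sha1).digest()
    for n in (0, 1, 19, 20, 32, 64)
}
print(len(digests))   # 1, every key pads out to the same 64 zero bytes
```

So a different output for an odd number of spaces means the decoded key must contain at least one non-zero byte.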

Other than the unknown handling of hex strings containing an odd number of spaces, my prior list of how $hotp and $totp strings are handled changes slightly to:

Rule #1. Current Rule:
a) if key is a string length 40 or 64 or 128
b) and is composed of hex digits and/or spaces
the spaces are removed, and the remaining pairs of hex digits are decoded to byte values 0-255. Value 0 is not added to the decoded key, but otherwise the UTF-8 encoding of each individual value is added to the binary key instead of the actual byte value.
i.e. key matches $regex(key,^[a-f A-F0-9]+$). This means a string of length 40/64/128 consisting entirely of spaces outputs the same value as when those strings consist entirely of 0's: //echo -a $hotp($str(0,40),9) $hotp($str($chr(32),40),9) $hotp($str(0,64),9)
As mentioned above, I'm unable to determine how these strings having an odd number of hex digits are handled.

Fixed Rule:
a) spaces should be removed before checking if string length is 40/64/128.
b) Spaces can only be present if preceded by exactly 4 hex digits.
c) when spaces are removed, string has 40/64/128 case insensitive hex digits.
i.e. key matches $regex(key,^([a-fA-F0-9]{4} )*[a-fA-F0-9]+$) and $istok(40 64 128,$len($remove(key,$chr(32))),32)
Hex strings should be decoded the same way base32 strings are handled, where bytes are not UTF-8 encoded and 0x00's are not stripped. Actual binary key will always be exactly 20 or 32 or 64 bytes in length.

Code:
//var %a 00deadbeef1234567890cafeface87654321bade | bset &v 1 $regsubex(%a,/(..)/g,$base(\1,16,10) $chr(32)) | echo -a hex string %a should produce binary key $bvar(&v,1-)



Rule #2. Current Rule:
a) if string doesn't match #1, but has a length of 16 or greater that's a multiple of 8. i.e. can be length 40/64/128 if it contains at least 1 of the base32 characters not present in the hex alphabet.
b) and is composed of case-insensitive base32 digits and/or spaces
i.e. key matches $regex(key,^[a-z A-Z2-7]+$).
If it matches both requirements, the spaces are stripped and the actual key is the base32 decoding of the remaining string regardless of length.
Rule#2 is not able to handle keys consisting entirely of space characters, even though Rules 1 and 3 can. I'm guessing this is caused by an internal passing of the key to $decode() which does not accept $null strings. $hotp($str($chr(32),N),9) is valid syntax as long as N is not a multiple of 8 greater than 8 other than lengths 40/64/128.
Rule #2 currently has the problem where all text sentences with spaces but having no punctuation, whose length is a multiple of 8, are falsely matched as if they're a case-insensitive base32 string.

Fixed Rule:
a) if string doesn't match rule #1, spaces should be removed before checking if string length is 16/26/32, or optionally also extended to all multiples of 8 above length 8.
b) Spaces can only be present if preceded by exactly 4 case-insensitive base32 digits.
c) when spaces are removed, string is only case insensitive base32 digits.
i.e. key matches $regex(key,^([a-zA-Z2-7]{4} )*[a-zA-Z2-7]+$)
It's debatable whether some false positives should be avoided by requiring the base32 string to be entirely upper or entirely lower case.

Rule #3. Current Rule: Any remaining string is used as the literal case-sensitive key, assumed to already be UTF8 encoded.

Fixed Rule: No change, except many strings change between being handled as hex/base32 encoded vs being handled as literal text.
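The fixed rules could be sketched roughly like this in Python. This is only an illustration of the proposal: decode_key and the returned labels are hypothetical names, the regexes are the ones given above, and the base32 branch uses the strict 16/26/32 lengths without the optional multiples-of-8 extension.

```python
import base64
import re

HEX_RE = re.compile(r"^([a-fA-F0-9]{4} )*[a-fA-F0-9]+$")
B32_RE = re.compile(r"^([a-zA-Z2-7]{4} )*[a-zA-Z2-7]+$")

def decode_key(key: str) -> tuple[str, bytes]:
    """Classify a key under the proposed fixed rules and return (kind, binary key)."""
    compact = key.replace(" ", "")          # lengths are checked after removing spaces
    # Rule 1: 40/64/128 hex digits, decoded to raw bytes (0x00 kept, no UTF-8 step).
    if HEX_RE.match(key) and len(compact) in (40, 64, 128):
        return "hex", bytes.fromhex(compact)
    # Rule 2: 16/26/32 base32 characters, case-insensitive, no '=' in the input.
    if B32_RE.match(key) and len(compact) in (16, 26, 32):
        padded = compact.upper() + "=" * (-len(compact) % 8)
        return "base32", base64.b32decode(padded)
    # Rule 3: anything else is literal text.
    return "text", key.encode("utf-8")

print(decode_key("acqo ua72 d3yf a4e5 uorx ztkh j2xl 3wiz")[0])   # base32
print(decode_key("CuRiOsItY KiLlEd ThE CaT")[0])                  # text
```

Under these rules the grouped-by-4 Google Authenticator style key is decoded, while the passphrase stays literal text because its spaces do not follow groups of exactly 4 characters.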

-

To fix all these HOTP/TOTP problems and preserve backwards compatibility, there would need to be an additional switch to identify the type of key present. There are currently several ways where a key intended to be handled one way is handled another way. The most common ones are:
1. Text passphrase whose length is divisible by 8 is treated as if it were a case-insensitive base32 string:
Code:
 //var %a CuRiOsItY KiLlEd ThE CaT | echo -a $len(%a) $hotp(%a,1) $hotp($upper(%a),1) $hotp($lower(%a),1)

2. Base32 key divided by spaces into groups of 4 characters is treated as if case-sensitive text string because its length isn't a multiple of 8: //echo -a $len(aaaa bbbb cccc dddd)

The additional parameter would default to current behavior, but would allow a switch to force ambiguous keys to be handled correctly.

$hotp(key, count, hash, digits)
$totp(key, time, hash, digits, timestep)
->
$hotp(key, count, hash, digits, type)
$totp(key, time, hash, digits, timestep, type)

type 0 or parameter not present = backwards compatible current behavior

type 1 = key is &binvar

type 3 = Using fixed versions of rules 1/2/3 to determine how to handle the string.

type h = force handling as hex encoded regardless of length. Remove any spaces present, and return error if any non-hex-digits remain or if the number of hex digits isn't an even number greater than zero. Output includes 0x00's and is not translated from binary byte to UTF-8 encoding.

type a = force handling as base32 string regardless of length. Remove any spaces present, and return error if any non-base32-digits or non '=' padding remain.

type m or type u = same as type a, except handling as input to $decode(key,m) or $decode(key,u).

type t = force handling as Rule#3 case-sensitive literal text.

Edit: /help says key is required, but //echo -a $hotp(,9) is the same as $hotp($str(0,40),9)

Last edited by maroon; 06/08/18 09:33 PM.
Joined: Jan 2004
Posts: 2,127
maroon Offline OP
Hoopy frood
For reference when you look at this to-do list item, the summary below covers my research and gives a solution that fixes the issues I mentioned in the past, without needing to create a new parameter to define the key format.

+ eliminates the vast majority of incorrect guesses between text and base32 by skipping rare magic base32 key lengths

+ supports the most common usage of Google Authenticator base32 format, which is a length 39 string having 32 text + 7 spaces

+ offers a way for users to have 100% of all key lengths supported if using hex format

The proposed modified design:


1. LEN is the length of the key-parameter string excluding all spaces
2. if LEN is any of 40 64 128 256 && key is valid hex and spaces, then KeyFormat = hex
3. elseif LEN is any of 16 26 32 (possibly also 103 205) && key is valid base32 and spaces, then KeyFormat = base32
4. else KeyFormat = text
5. If KeyFormat = hex then KEY = decoded from hex to a byte string of length LEN/2, without UTF8 encoding to text
6. If KeyFormat = base32 then KEY = decoded from base32 to binary, as is currently done
7. If KeyFormat = text then KEY = unaltered key parameter as currently done
8. If byte_length(KEY) > block_length(hashname), then KEY = binary hashname_digest(KEY)
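As a rough illustration of steps 1-8, here is a minimal Python sketch. It is not mIRC's implementation: `decode_key` is a hypothetical name, the optional 103/205 base32 lengths are left out, and the text branch assumes the key parameter is UTF-8 text.

```python
import base64
import hashlib
import re

HEX_LENGTHS = {40, 64, 128, 256}   # step 2
BASE32_LENGTHS = {16, 26, 32}      # step 3 (optional 103/205 omitted)

def decode_key(key: str, hashname: str = "sha1"):
    """Return (key_format, binary_key) following the proposed steps."""
    compact = key.replace(" ", "")            # step 1: LEN excludes spaces
    n = len(compact)
    if n in HEX_LENGTHS and re.fullmatch(r"[0-9a-fA-F]+", compact):
        fmt, raw = "hex", bytes.fromhex(compact)          # step 5
    elif n in BASE32_LENGTHS and re.fullmatch(r"[a-zA-Z2-7]+", compact):
        # b32decode needs upper case and '=' padding to a multiple of 8.
        padded = compact.upper() + "=" * (-n % 8)
        fmt, raw = "base32", base64.b32decode(padded)     # step 6
    else:
        fmt, raw = "text", key.encode("utf-8")            # step 7
    # step 8: keys longer than the hash block are replaced by their digest
    if len(raw) > hashlib.new(hashname).block_size:
        raw = hashlib.new(hashname, raw).digest()
    return fmt, raw

fmt, raw = decode_key("00deadbeef1234567890cafeface87654321bade")
print(fmt, raw.hex())   # the leading 0x00 byte is kept
```

The point of the sketch is that the hex branch hands back the raw bytes, including any 0x00 or 128-255 values, the same way the base32 branch does.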


Key points...

  • $hotp and $totp already correctly handle steps #6-8.
    .
  • Avoiding UTF8 encoding of hex keys makes hex keys compatible with everyone else.
    .
    Originally Posted by Khaled
    I based the design of $hotp()/$totp() on many of the real-world C/C++ examples, discussions, and examples I found. So, for example, UTF-8 encoding all of the formats was something that was common to the implementations I saw, so that is what mIRC does. UTF-8 encoding obviously breaks keys that include null bytes. So the question is, are null bytes actually allowed?

    UTF8 encoding breaks all keys having bytes outside the 1-127 range, except for 0x00 bytes in the trailing position, which just get put right back in by HMAC padding short keys with 0x00's. Yes, null bytes are allowed, and in keys randomly generated by Google Authenticator and other programs trying to be compatible with it, null bytes are no more/less common than any other byte, and by random chance approximately 1 in 14 160-bit Google Authenticator keys has at least 1 null byte.

    I have never found any implementation out there who takes a hex/base32 encoded binary key who then modifies it by UTF8 encoding it or altering it in any way.

    The ambiguity is caused by the unfortunate decision by the RFC author to use 20 text numerics as the 1 and only test vector, which made it less obvious that the key is a binary string where there are 2^N possible N-bit strings.

    You can google for 'totp generator online' and can find several sites that will return a 6 digit code for whatever base32 key you feed it, including xanxys.net/totp totp.danhersam.com or totp.app

    At xanxys they randomly generate keys, and you'll find one having a null byte after refreshing a few times. They present the binary byte string encoded as both hex and as the equivalent base32 encoding.

    If you input that base32 string into the danhersam template it returns the same digits whether the 'key' is with/without optional spaces, and if $totp is fed the same base32 string with the spaces removed, it also matches the danhersam results assuming clocks are synched. The base32 result is currently generated correctly by handling all the bytes including the null bytes as a binary string without modifying them into UTF8 text, so the hex equivalent key should be handled compatibly with the way base32 keys are.

    I cannot find any such discussions, code, or examples which involves hex or base32 keys containing bytes outside the 1-127 range then modifying them into UTF8 text. The closest I could find was discussions for programs like OpenSSL in discussions unrelated to TOTP, which were about the correct way to handle normal text when used as passphrases or how to interpret name or other text identifier fields. And there the alternative to UTF8 was the old way of everyone interpreting the text in their own local codepage.

    Here's something that's not just an online generator, but a real world implementation of TOTP which does not transform the key into UTF8 text nor strip 0x00 bytes. You can make an account at cservice.undernet.org which registers you with 'X', their rough equivalent of nickserv/chanserv.

    In your account area in their website, you can enable an option where you'd need to use TOTP in order to login at their website and into your IRC account. Instead of scanning the QR image into your phone, you can click on 'enter your secret key manually' which gives a 160-bit Google Authenticator key that's been encoded as base32 then padded with 7 spaces to a length 39 string. If the 1st base32 key they give you doesn't contain an embedded 0x00 byte, you can halt the registration process and retry asking to setup TOTP a few more times until you're given such a key. With the following command in the editbox ready to press <enter>, it makes it easy to know when the clipboard contains the key containing the 0x00 byte. The spaces don't need to be stripped, because $decode base32 conveniently ignores spaces.

    //var %a $cb | bset -t &v 1 %a | echo -a $bvar(&v,0) : $decode(&v,ba) : $regsubex($bvar(&v,1-),/\b(0)\b/g,$chr(22) \t $chr(22))

    When you are given a key containing the null byte, you can take the length 39 string they give you, remove the spaces to make the length be 32, and feeding that to $totp will give the correct 6 digits if your clock is in synch.

    They won't enable TOTP mode until you can prove that you're able to answer with the correct 6-digit codes. And the only way to make Undernet happy is for the 6 digit code to be created the same way base32 is already handled, where the bytes outside the 1-127 range are not changed or stripped.

    And the hex keys should be handled in an equivalent manner to the base32 keys.
    .
  • By adding 256 as a magic hex length, the 128 and 256 hex-digit lengths can provide support for 100% of keys and keylengths for all hashnames including the longer block lengths belonging to SHA384/512.

    Because HMAC pads 0x00 bytes to keys shorter than the 64/128 block length used by the underlying hash, all hex lengths shorter than 128 hex digits which are being handled correctly at the 40 or 64 lengths, would also be handled identically when '0' digits are appended to make them be 128 digit hex. And for sha384/sha512, the same would be true for all hex keys for all hex lengths shorter than 256.

    Almost all hex keys created randomly or from a hash digest are going to have at least 1 byte outside the 1-127 range, so it's extremely unlikely that any of those random keys were not broken, and would have been compatible only between clients interpreting them the same way.
    .
  • By excluding spaces when checking for the 'magic lengths', this cuts down on the key lengths where text strings can be incorrectly guessed as base32. There's 3 Google Authenticator lengths, 16,26,32. And when they have the normal amount of optional spaces padding, that would add the lengths 19 and 37. Instead of having extra magic lengths that could provide additional collisions with text strings, excluding spaces from that count would only need to support the 16/26/32, and it wouldn't matter whether they used some/all/none/extra optional spaces.

    From what I can tell, the length 39 text string used by Undernet to present the base32 key is by far the most common way of presenting Google Authenticator keys.
    .
  • This elimination of all the undocumented multiple-of-8 base32 lengths greatly reduces the case-sensitive text+spaces passphrases that could be seen as if case-insensitive base32.

    Originally Posted by Khaled
    That is correct. The comments in my code state that it should check for >=16 and multiples of 8 instead of just 16/24/32. I cannot remember why as I implemented this feature three years ago and researched it at that time. You will need to research this, and the Google Authenticator format, yourself I'm afraid.

    The person you found stating that it needs to be lengths 16/24/32 plus other multiples of 8 bytes was not correct on either point. Yes, the lengths are >= 16 because that's the shortest base32 length used by Google Authenticator. But the G.A. lengths are 16/26/32, not 16/24/32, because the 1st trio are the 3 lengths for the base32 encodings of bit lengths 80,128,160 which have been common key lengths in the past and present. On the other hand, 24 would be the encoding of 24*5=120 bits. While that's a valid key length as all byte lengths are, it wouldn't be something chosen by Google.

    Also, they should have been talking about multiples of 8 bits, not bytes, and would not have even mentioned this topic except for the fact that the 128-bit length encodes as a length 26 base32 string which has the capacity to hold 26*5=130 bits. But what they were trying to say is nothing special, as that's simply the normal behavior that $decode already has when handling mime and base32 strings, where the decoded string contains only bytes where all 8 bits had been encoded into the mime/base32/uuencode string, and does not extract any bytes where only some of their bits were encoded.

    So, what they should have been saying was that the Google Authenticator handling of length 26 base32 strings should be the same way that $decode and all other decoders handle it, which is to see it as the longest byte string it can hold completely, 16*8=128 bits, not the longest bit string it can hold of 26*5=130 bits.
    .
  • Most of the undocumented multiple-of-8 base32 lengths would be rarely if ever used. Those 8x lengths would be the encoding of bit-lengths that are multiples of 40 bits, of which the 40x lengths most likely to be encoded as base32 are the 16-character (80-bit) and 32-character (160-bit) strings already used by Google Authenticator.

    Base32 is used little except at the G.A. lengths. The main reason they use base32 is because they wanted to help people hand-typing the key from 1 device to another, which would be less likely to be done interactively unless they know copypaste is available. They chose base32 because it's shorter than hex, and didn't use mime because base32 is case-insensitive. If not for that, they would have used hex which is built into compiler languages, and avoids the need to create tricky base32 decoders.

    If you wish to offer 100% coverage for all possible keys encoded in the base32 format, you can still trim the magic base32 lengths down to just the 3 G.A. lengths 16 26 32, and from there you can optionally accept 103 and 205 as base32 lengths, which decode to the same binary lengths as the above hex lengths 128 and 256. Users would then be able to take any base32 string that's already a multiple of 8 'characters' and that's shorter than 103, and all they'd need to do is just append enough "A" digits to reach 103, and for sha384/sha512 they can also do the same "A" extending of strings to length 205. Something like:

    //var %base32key DEADBEEF | if ((8 // $len(%base32key)) && ($len(%base32key) < 103)) var -s %base32key %base32key $+ $str(A,$calc($v2 -$v1))

    The only application I've seen using base32 instead of hex for keys longer than the Google-Authenticator strings having 32 digits - is at verifyr.com/en/otp/check where they have a template that generates random TOTP keys then lets you test them against the current time. They generate random keys that occasionally contain null bytes, and their base32 string has the 103 length above, because they decided to have the key length be the same binary length as hex 128 strings have.

    The only drawback I see from adding 103 and 205 as magic base32 lengths is that these might create collisions with word+spaces passphrases, but as long as /help warns of these lengths, they can either avoid them or make sure their passphrases use at least 1 character that's not base32.
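The arithmetic behind several of the key points above can be checked with a short Python sketch. The helper names are hypothetical and the sample key is just the hex string used earlier in this thread. It checks three things: the unpadded base32 length for a given byte length, the chance that a random key contains at least one 0x00 byte, and the fact that HMAC treats a short key and the same key extended with 0x00 bytes to the block length identically.

```python
import base64
import hashlib
import hmac

def base32_length(nbytes: int) -> int:
    """Unpadded base32 characters needed to carry nbytes whole bytes."""
    return -(-nbytes * 8 // 5)   # ceiling of bits / 5

def p_has_null(nbytes: int) -> float:
    """Probability that a uniformly random key has at least one 0x00 byte."""
    return 1 - (255 / 256) ** nbytes

# 80/128/160-bit keys, plus the 64- and 128-byte HMAC block lengths
lengths = [base32_length(n) for n in (10, 16, 20, 64, 128)]
print(lengths)

# Cross-check one length against a real encoder
print(len(base64.b32encode(bytes(16)).rstrip(b"=")))

print(round(p_has_null(20), 4))

# HMAC pads short keys with 0x00 up to the block length (64 for SHA-1)
key = bytes.fromhex("00deadbeef1234567890cafeface87654321bade")
msg = (1).to_bytes(8, "big")
padded = key + b"\x00" * (64 - len(key))
same = (hmac.new(key, msg, hashlib.sha1).digest()
        == hmac.new(padded, msg, hashlib.sha1).digest())
print(same)
```

This is why appending '0' hex digits or 'A' base32 digits up to a magic length leaves the resulting codes unchanged, as long as the decoder keeps every byte.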


You had previously posted...

Originally Posted by Khaled
This implementation should match the one used by OpenSSL. Have you checked with OpenSSL to confirm whether this identifier is or is not matching it? If it is matching it, it cannot be changed.

I have looked at current/prior OpenSSL version source code, and can find no evidence that HOTP or TOTP have ever been part of OpenSSL. People may be taking advantage of the HMAC built into OpenSSL to avoid writing their own HMAC function, but any OpenSSL involvement is just as a subroutine that takes whatever binary strings are passed to it and spits back a hash digest, while the applications do their own truncating to create the digits.

Quote
Other similar issues.

Summary of a few other items I'd mentioned previously along with these issues

  • Digits being limited to 9

    I'm guessing the max of 9 was from some application deciding that 9 was enough. The RFC doesn't say 10 digits is invalid, it just warns that going from 9 -> 10 digits just doubles the strength instead of getting the 10x strength from adding the other digits.
    .
  • Timestep limited to 3600

    If the decision is to continue limiting to 3600, that's fine, but would be great if it were documented in /help and treated as if an invalid parameter. While for most cases the timestep window is kept at the default 30, there can be situations where the interval would be longer.

    RFC 6287 is OCRA authentication, and that uses HOTP/TOTP in a way which expects it to support as many as 10 digits and to support timestep as large as 48 hours. They even have a digits=0 mode that returns the entire hex hash from HMAC instead of truncating it to a few numbers. But if $hmac is able to accept binary keys directly, it would be simple to get that full digest directly, as the complicated part is making the 6 digits.

    If users know about the 3600 limit, they can do an easy workaround by setting timestep=1 then do something similar to this one that changes the code at 9am each day.

    $totp(key,$calc( ($ctime -$timezone -3600*9)//86400),sha1,6,1)
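To show why the workaround above rolls over once a day, here is a small Python sketch of the same arithmetic. `daily_counter` is a hypothetical helper, and it assumes the timezone offset is given in seconds and subtracted from UTC time to get local time, matching how the $calc expression uses $timezone.

```python
def daily_counter(ctime: int, tz_offset: int, hour: int = 9) -> int:
    """Counter that increments once per day at `hour` local time.

    ctime     : Unix time in seconds
    tz_offset : seconds subtracted from UTC to get local time (assumption)
    """
    return (ctime - tz_offset - 3600 * hour) // 86400

day = 86400 * 100
print(daily_counter(day + 9 * 3600 - 1, 0))  # one second before 9am
print(daily_counter(day + 9 * 3600, 0))      # exactly 9am
```

With timestep=1, passing this value as the time parameter makes it the TOTP counter directly, so the code only changes when the day number does.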

Joined: Jan 2004
Posts: 2,127
maroon Offline OP
Hoopy frood
Great, looks like base32 Google Authenticator is working with/without space padding, but a few cases remain:

Quote
digits=10:

//var -s %count 7184224 , %key abcdefghijklmnopqrstuvwxyz234567 | echo -a $hotp(%key,%count,sha1,6) vs $hotp(%key,%count,sha1,9) vs $hotp(%key,%count,sha1,10)

result: 421874 vs 147421874 vs 0737356466
the 10 digits value should be 2147421874

It looks like the truncate31bit is being held in a uint32 which overflowed from division by 10^10, as the result here is $calc( 2147421874 % (10^10 % 2^32 ) )

Probably the simplest way to handle it is like:

truncate31bit = (31-bit value from dynamic truncation of the HMAC digest)
if (digits < 10) truncate31bit = truncate31bit % 10^digits
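The suspected overflow and the suggested fix can be sketched in Python. This follows the RFC 4226 truncation steps rather than mIRC's source; the function name `hotp` and its parameter order are illustrative only, and the key in the example is the RFC's ASCII test key.

```python
import hashlib
import hmac
import struct

def hotp(key: bytes, counter: int, digits: int = 6, hashname: str = "sha1") -> str:
    """HOTP per RFC 4226, skipping the modulus when digits is 10."""
    mac = hmac.new(key, struct.pack(">Q", counter), hashname).digest()
    offset = mac[-1] & 0x0F                       # dynamic truncation offset
    value = int.from_bytes(mac[offset:offset + 4], "big") & 0x7FFFFFFF
    if digits < 10:                               # 10^10 exceeds 31 bits
        value %= 10 ** digits
    return str(value).zfill(digits)

# Reproduce the reported wrong value: 10^10 wrapped into a uint32 modulus
WRAPPED_MODULUS = 10**10 % 2**32
buggy = 2147421874 % WRAPPED_MODULUS
print(buggy)

rfc_key = b"12345678901234567890"
print(hotp(rfc_key, 0), hotp(rfc_key, 0, 10))
```

Because the truncated value never exceeds 2^31-1, a 10-digit result is just the value itself zero-padded on the left, so no modulus is needed in that case.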

Quote
timestep > max

//var %ctime 1234567890 | echo -a $totp(key,%ctime,sha1,6,$calc( 3600*48 )) vs $totp(key,%ctime,sha1,6,$calc( 30 )) vs $totp(key,%ctime,sha1,6,$calc( 3600*48 +1))

result: 904921 vs 336986 vs 336986
should be 904921 vs 336986 vs invalid parm

Defaults are for when a parameter is not used, and just like $hotp(key,0,foobar,11) rejects the hashname and digits if not within the accepted range, a timestep outside 1-172800 should be an error instead of being replaced with 30.

Quote
Stripping 0x00 bytes from hex keys

It appears that hex keys are no longer encoding bytes 128-255 as UTF8 text, but are still stripping 0x00 bytes. The example below generates identical %base32 and %hexhex results when the bytes are encoded as either hex or base32, and results match the totp.danhersam.com online template. However if changing 1-or-more of the 1st 19 bytes to 0, the base32 encoding continues to match the template, but %hexhex instead matches the %stripd key where the 0x00 are moved to the tail.

Fixing the handling of bytes 128-255 made 13/14 of random 160-bit hex keys be compatible, but having keys with 0's in different locations being treated as equivalent means that it's something less than 2^160 possible keys, as well as not having compatible results for 1/14th of 160-bit keys.

Code:
alias totp_hex_zeroes {
  bset &v 1 195 169 233 204 205 206 207 208 209 210 211 212 213 214 215 216 217 218 219 220
  var -s %hexhex $regsubex($bvar(&v,1-),/(\d+)/g,$base(\t,10,16,2)) , %stripd $remtok(%hexhex,00,0,32)
  while ($numtok(%stripd,32) < 20) var %stripd %stripd 00 | noop $encode(&v,ba)
  var %base32 $remove($bvar(&v,1-).text,=)
  echo -a base32 $totp(%base32) %base32
  echo -a hexhex $totp(%hexhex) %hexhex
  echo -a stripd $totp(%stripd) %stripd
}
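To illustrate why stripping 0x00 bytes shrinks the key space, here is a Python model of the behavior described above. It is an assumption based on the observed results, not mIRC's actual code, and `strip_and_pad` is a hypothetical name.

```python
import hashlib
import hmac

def strip_and_pad(key: bytes) -> bytes:
    """Model of the observed behavior: drop 0x00 bytes, re-pad at the tail."""
    kept = bytes(b for b in key if b != 0)
    return kept + b"\x00" * (len(key) - len(kept))

# Two different keys whose only difference is where the 0x00 byte sits
k1 = bytes([0, 195, 169, 233])
k2 = bytes([195, 0, 169, 233])
msg = (1).to_bytes(8, "big")

def mac(key: bytes) -> bytes:
    return hmac.new(key, msg, hashlib.sha1).digest()

print(mac(k1) == mac(k2))                                # distinct keys
print(mac(strip_and_pad(k1)) == mac(strip_and_pad(k2)))  # after stripping
```

Under this model every key containing a 0x00 byte collides with other keys that hold the same non-zero bytes in the same order, which is the loss of key space described above.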

