Hello, some feedback on this.
It would be helpful to add a parameter to $utfencode that makes it not truncate. This is an actual problem in scripts, because $utfencode is the function you need to use to find the UTF-8 length of a string and see whether it exceeds a limit or not.
Something like $utfencode(input,[charset],[%var|&binvar]), returning the correct length of the UTF-8 string when %var|&binvar is used, and copying only the maximum allowed length into the output %variable (or all of it for a binvar, of course)? This would not only let us get the real length, it would also give us the option of using a binvar to access all the bytes regardless of the length.
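To make the suggestion concrete, here is a rough model of it in Python rather than msl, so it can be run outside mIRC. The function name utfencode_model, its parameters, and the 10340-byte cap are all hypothetical and only illustrate the idea; none of this is existing mIRC behaviour.

```python
def utfencode_model(text, limit, into_binvar=False):
    """Hypothetical model of the proposed third parameter (not mIRC's API).

    Always returns the full UTF-8 length, plus the bytes that would be stored:
    everything for a binvar, at most `limit` bytes for a %variable.
    """
    encoded = text.encode("utf-8")
    stored = encoded if into_binvar else encoded[:limit]
    return len(encoded), stored


text = chr(10004) * 6000  # same input as $str($chr(10004),6000)

length, stored = utfencode_model(text, 10340)
print(length, len(stored))  # 18000 10340

length, stored = utfencode_model(text, 10340, into_binvar=True)
print(length, len(stored))  # 18000 18000
```

In both cases the caller learns the real length (18000), so a script can tell that the %variable copy is incomplete instead of guessing.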
There is a note in versions.txt saying that binvars are no longer limited in the number of bytes they can store. One of the advantages of binvars in mIRC is that they overcome the limit on characters for a line/parameter/etc.
Take /bset in maroon's example: $str() is evaluated correctly, because 6000 characters is OK. His /bset uses -t, so it can very easily be argued that we're dealing with a string according to your definition, and the limit should therefore be in characters.
Now you could argue that no, this is a binary variable command and the -t switch does not override that.
That would be fine, except it goes against the principle of allowing binvars to hold any number of bytes. If a binvar can hold as much as we want, why is /bset silently limiting the number of bytes added? If it makes perfect sense for a binvar to hold that many bytes, then as long as $str resolves and the total line length in characters is not beyond $maxlenl+100 (the current real limit), I don't see why maroon's /bset would fail. To me that's simply a bug in /bset. The expected result can be achieved with two /bset commands, so I don't see why not with one.
$sha256 and the like are identifiers, so I also don't understand why they would chop the input at the limit; the result returned won't exceed any limit.
It is extremely unpleasant not to get an error, because, especially with $maxlenl changing over time, you have no idea that only part of the input has been used, and it will just cut a UTF-8 character in 'half' (this is the case in maroon's example when $maxlenl+100 = 10340).
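The arithmetic behind the 'half' character can be checked outside mIRC. The Python sketch below assumes the cut happens at exactly 10340 bytes of the UTF-8 form, as described above; it only models the byte counts and says nothing about how mIRC truncates internally.

```python
LIMIT = 10340  # $maxlenl+100, the value quoted in this post

text = chr(10004) * 6000        # same input as $str($chr(10004),6000)
encoded = text.encode("utf-8")  # $chr(10004) is U+2714, 3 bytes in UTF-8
truncated = encoded[:LIMIT]

whole_chars, leftover = divmod(LIMIT, 3)
print(len(encoded))           # 18000
print(whole_chars, leftover)  # 3446 2

# Two leftover bytes mean the last character is incomplete,
# so a strict decode of the truncated data fails.
try:
    truncated.decode("utf-8")
    cut_in_half = False
except UnicodeDecodeError:
    cut_in_half = True
print(cut_in_half)            # True
```

So 6000 characters that are fine as a line become 18000 bytes, and a 10340-byte cut lands two bytes into the 3447th character.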
$regsubex suffers from the same problem, and it's not very nice either: //noop $regsubex(foo,$str($chr(10004),6000),,) gives "line too long" for $regsubex, despite $str() being fine and the result being $null.
It has to be said that $sha* etc. and $regsubex are not binary functions as far as the scripter is concerned.
I know that the common ground is the conversion to a single-byte array; I just don't think it's necessary to apply the limit there. The scripting engine itself should be enough to handle that.
From my experience with msl (and it's certainly true for custom aliases), any identifier parameter is limited to a maximum length of $maxlenl+Leeway (Leeway being 100 atm).
Of course there are exceptions, like $len, but the $sha* family should all be exceptions as well.
And even though $len has no check on string length, $len can actually never return more than $maxlenl+Leeway: if you pass more than that as plain text to $len, the scripting engine stops and returns "$len: invalid parameter". But this is certainly not a limit of $len itself, just the engine parsing a plain-text parameter, I assume. With non-plain text, well, you're limited to $maxlenl+Leeway anyway or you'll get a "line too long" error. The same applies to $sha256.
That already gives us a limit on the number of bytes that can be written internally to the single-byte array from such an identifier call: the above x 4.
$maxlenl being 10240 for now, that's a limit of around 40kb from the msl engine itself, and that memory is released immediately after the call (it might be a bit different memory-wise for $regsubex, but for most identifiers like that, I believe the memory is released immediately). I don't think mIRC is in any danger.
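For reference, the "around 40kb" figure works out as follows. This is just the arithmetic from the paragraph above written in Python, taking the factor of 4 bytes per character and the current values of $maxlenl and the leeway as given.

```python
MAXLENL = 10240         # current $maxlenl, per this post
LEEWAY = 100            # current leeway, per this post
MAX_BYTES_PER_CHAR = 4  # worst-case factor assumed in this post

max_chars = MAXLENL + LEEWAY
worst_case_bytes = max_chars * MAX_BYTES_PER_CHAR

print(max_chars)                          # 10340
print(worst_case_bytes)                   # 41360
print(round(worst_case_bytes / 1024, 1))  # 40.4
```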
All in all, with Unicode and, in the future, 64 bits, I don't think the internal limit on bytes for functions requiring such a conversion makes mIRC much safer; rather, we get stuck on things we feel should be working.
Of course mIRC needs some kind of limit, and it's nice to have it extended etc., but it makes sense to limit only the number of characters in our scripts, since binvars are not limited.
I believe this limit, derived from the parameter length in characters, is enough.
Based on the above, I would like to see the limit removed for all identifiers where this applies, because we can always use a binvar directly with the identifiers themselves (again, when applicable: I didn't check every identifier, and ideally it would always be applicable, but it does work with the $sha* family and via $bfind().regex for regex identifiers).