I rather expected your answer (multiple conversions, UTF encoding, double-byte string type limitations).

I wonder if it's possible to detect an unpaired surrogate, hold it back until the rest of the sequence arrives, and only then run the UTF decoding on the completed sequence. I think characters of up to 8 bytes are possible.
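To make the idea concrete, here is a rough sketch of what I have in mind, not a tested solution for your setup. It assumes the data arrives as chunks of 16-bit UTF-16 code units, where a surrogate pair is always exactly two units (4 bytes), so holding back a single unit is enough. The class name `SurrogateBuffer` and its methods are made up for illustration, and longer sequences such as combining characters are not handled.

```python
class SurrogateBuffer:
    """Hypothetical helper: holds back a dangling high surrogate between chunks."""

    def __init__(self):
        self._pending = None  # high surrogate still waiting for its partner

    def feed(self, units):
        """Take an iterable of 16-bit code units; return the text that is complete so far."""
        out = []
        for u in units:
            if self._pending is not None:
                if 0xDC00 <= u <= 0xDFFF:
                    # Pair completed: combine both halves into one code point.
                    cp = 0x10000 + ((self._pending - 0xD800) << 10) + (u - 0xDC00)
                    out.append(chr(cp))
                    self._pending = None
                    continue
                # The partner never came: replace the lone high surrogate.
                out.append("\ufffd")
                self._pending = None
            if 0xD800 <= u <= 0xDBFF:
                self._pending = u      # high surrogate: wait for the next unit
            elif 0xDC00 <= u <= 0xDFFF:
                out.append("\ufffd")   # low surrogate with no high before it
            else:
                out.append(chr(u))     # ordinary BMP character
        return "".join(out)

    def flush(self):
        """Call at end of input; a surrogate still pending here is truly unpaired."""
        if self._pending is not None:
            self._pending = None
            return "\ufffd"
        return ""


buf = SurrogateBuffer()
first = buf.feed([0x48, 0xD83D])    # pair for U+1F600 is split after the high half
second = buf.feed([0xDE00, 0x21])   # low half arrives with the next chunk
```

The point is that nothing is emitted for the high surrogate until the next unit shows whether it is a valid partner. If the input is raw bytes rather than code units, Python's `codecs.getincrementaldecoder("utf-16-le")` should do the same kind of buffering for you.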


Well. At least I won lunch.
Good philosophy, seeing the good in the bad. I like it!