Honestly, I don't fully follow your post. Could you give me a tweet-length summary?
> Each Unicode codepoint above 2047 is UTF-8 encoded as 3 bytes, so encoding such strings can potentially have a MIME'ed string be 4x as long as the $len of the original text string
Seems copacetic to me.
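For what it's worth, here is a quick sanity check of that arithmetic. It is only a sketch: it assumes "MIME'ed" means plain Base64 with no line breaks or encoded-word headers, it uses Python's standard `base64` module rather than whatever the original code used, and `base64_expansion` is a name I made up for illustration. The sample character (U+20AC) is above 2047 but still inside the Basic Multilingual Plane, so it takes 3 bytes in UTF-8.

```python
import base64


def base64_expansion(text: str) -> float:
    """Return Base64 output length divided by the character count of text.

    Hypothetical helper for illustration only. Assumes plain Base64
    without MIME line wrapping or header overhead.
    """
    encoded = base64.b64encode(text.encode("utf-8"))
    return len(encoded) / len(text)


# U+20AC (EURO SIGN) is above 2047, so each character is 3 UTF-8 bytes.
euro_ratio = base64_expansion("\u20ac" * 300)

# Plain ASCII is 1 byte per character, so only the 4/3 Base64 overhead applies.
ascii_ratio = base64_expansion("a" * 300)

print(euro_ratio, ascii_ratio)
```

300 characters become 900 bytes, and Base64 turns every 3 bytes into 4 output characters, which gives 1200 characters, or 4x the original length. Real MIME output adds line breaks on top of that, so treat 4x as the figure before wrapping.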