this result isn't actually limited to $regsubex; it affects every function in mIRC that implicitly decodes UTF-8, e.g.:
//bset -t &a 1 $chr($base(D800, 16, 10)) | echo -a $len($bvar(&a, 1-).text)
= 3
the internal UTF-8 decoding function won't touch unpaired surrogates, so the three bytes written for $chr(55296) come back as three characters instead of one. would tweaking this violate the sanctity of unicode? invalid UTF-8 is clearly already being represented at some level, so perhaps decoding it as well as possible isn't such a tall order?
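
to make the byte counts concrete, here's a rough sketch in python rather than mIRC script. python's codecs are only standing in for mIRC's internal encoder/decoder (that mapping is an assumption on my part), and `strict_decodes` is a made-up helper name. it shows a lone U+D800 taking three bytes, a strict decoder refusing them, and a lenient one recovering the single code point:

```python
# sketch only: python's codecs standing in for mIRC's internal encoder/decoder
lone = "\ud800"  # unpaired high surrogate, same code point as $chr(55296)

# generalized utf-8 form of the surrogate (strict utf-8 forbids it)
encoded = lone.encode("utf-8", errors="surrogatepass")
print(encoded.hex())  # eda080

def strict_decodes(data: bytes) -> bool:
    """return True if a strict utf-8 decoder accepts the bytes"""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False

# a decoder that leaves the bytes alone sees one character per byte
untouched = encoded.decode("latin-1")
# a lenient decoder maps the sequence back to the single surrogate
lenient = encoded.decode("utf-8", errors="surrogatepass")

print(strict_decodes(encoded), len(untouched), len(lenient))  # False 3 1
```

the 3 matches the $len result above; the 1 is what a "decode as well as possible" approach would give.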

btw, $regsubex() needs to encode (and later decode) the substitution parameter in order to play nice with the offsets returned by PCRE (which only handles UTF-8 encoded strings). this seems necessary, and the observed bug is an unfortunate side effect.
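
a small illustration of why the offsets force that round trip, again in python as a stand-in: matching on UTF-8 bytes yields byte offsets, which differ from character offsets as soon as anything non-ASCII precedes the match. python's `re` on bytes is not PCRE, and `byte_to_char_offset` is a hypothetical helper, not anything mIRC exposes:

```python
import re

def byte_to_char_offset(data: bytes, byte_offset: int) -> int:
    """convert a byte offset into utf-8 data to a character offset"""
    # decode only the prefix before the match and count its characters
    return len(data[:byte_offset].decode("utf-8"))

text = "héllo wörld"
data = text.encode("utf-8")

match = re.search(rb"w", data)  # matching on bytes gives byte offsets
byte_pos = match.start()
char_pos = byte_to_char_offset(data, byte_pos)

print(byte_pos, char_pos)  # 7 6
```

the conversion depends on decoding the bytes before the match, so any sequence the decoder refuses (like a lone surrogate) would plausibly throw the bookkeeping off, which may be how the side effect arises.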