This might relate to one of these threads

When not using //u, $regml().pos and $regmlex().pos report a position that is 1 or 2 higher when the match is the 2nd or 3rd byte of a multi-byte UTF-8 encoded character. As a result, a later byte in the string can report the same position as an earlier byte, or even a lower one: in the output below, A9 and the following "f" both report 2, and "b" reports 6 after 94 has already reported 7. The older thread seems to indicate that .pos originally returned the byte position within the UTF-8 string, but was later adjusted to return the $pos() position. If it's undesirable for all encoding bytes of the same character to return the same position, a separate .utfpos property might be useful.

//var -s %a $chr(233) $+ foo $+ $chr(10004) $+ bar | echo -ag $regex(foo1,%a,/(f)/) pos: $regml(foo1,1).pos vs $regex(foo2,%a,/(\xa9)/) pos: $regml(foo2,1).pos and $regex(foo3,%a,/(\x94)/) pos: $regml(foo3,1).pos vs $regex(foo4,%a,/(b)/) pos: $regml(foo4,1).pos | var -s %b $regsubex(foo,%a,/(.)/g,$base($asc(\t),10,16,2) $+ $chr(32)) | var %i 1 , %string | while (%i <= $regml(foo,0)) { var %string %string match# %i $gettok(%b,%i,32) $regml(foo,%i) is at pos: $regml(foo,%i).pos | inc %i } | echo -a %string

1 pos: 2 vs 1 pos: 2 and 1 pos: 7 vs 1 pos: 6

match# 1 C3 is at pos: 1 match# 2 A9 is at pos: 2 match# 3 66 is at pos: 2 match# 4 6F is at pos: 3 match# 5 6F is at pos: 4 match# 6 E2 is at pos: 5 match# 7 9C is at pos: 6 match# 8 94 is at pos: 7 match# 9 62 is at pos: 6 match# 10 61 is at pos: 7 match# 11 72 is at pos: 8
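
For anyone who wants to check the pattern outside mIRC, here is a rough model in Python (not mIRC script). It assumes the reported .pos equals the $pos()-style character position plus the byte's zero-based offset inside its UTF-8 sequence. That rule is only inferred from the output above; it is not taken from mIRC's source or documentation. The function name `modeled_positions` is made up for this sketch.

```python
def modeled_positions(text):
    """Return one row per UTF-8 byte: (hex byte, byte pos, char pos, modeled .pos).

    Assumed rule, inferred from the output in this post:
    modeled .pos = 1-based character position + 0-based offset of the byte
    within that character's UTF-8 sequence.
    """
    rows = []
    byte_pos = 0
    for char_pos, ch in enumerate(text, start=1):
        for offset, b in enumerate(ch.encode("utf-8")):
            byte_pos += 1
            rows.append((format(b, "02X"), byte_pos, char_pos, char_pos + offset))
    return rows


# Same test string as %a above: $chr(233) $+ foo $+ $chr(10004) $+ bar
sample = chr(233) + "foo" + chr(10004) + "bar"
rows = modeled_positions(sample)

for hex_byte, byte_pos, char_pos, modeled in rows:
    print(f"{hex_byte}: byte pos {byte_pos}, char pos {char_pos}, modeled .pos {modeled}")
```

The last column reproduces the positions in the match list above (1 2 2 3 4 5 6 7 6 7 8), while the byte-position column (1 through 11) is the kind of value a .utfpos property could return.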