At the risk of getting shot down for resurrecting this topic, I was wondering if anyone can shed any light on it as the problem/quirk still exists after 2.5 years!

To shed some more light on exactly what the hell is going on, I've made the following discoveries, each of which just makes it more mysterious.

  • aUabc,/a(\k)abc/ matches but aUabc,/a(\k)abcd/ does not, so it clearly takes account what follows the (\k)
  • aUabc,/a(\k)gabc/ matches but /a(\k)gabcd/ does not. It seems to match SOME of the expression after the (\k)! It would seem like it matches one or more characters from the end of the expression after (\k).
  • It gets even more unusual! aUooabcoo,/a(\k)gabc/ matches too! this would imply that one or more characters from the end of the string after (\k) can match at any position after the U in the input.
  • Now even more trippy. aUooabcoo,/a(\k)ga(*F)bc/ also matches. If the PCRE engine had ever evaluated the a(*F)bc, it would fail (since (*F) automatically causes a failure).
  • And what about this? aUabc,/a(\k)ab[zz]/ DOES match, but aUabc,/a(\k)ab[z]/ DOESN'T!


The last bullet point in particular I just cannot explain. If anyone has any other observations or comments, please post.

I've tried this in PHP, which uses PCRE, but doesn't exhibit the same behaviour and just matches a 'k' as you'd expect.

I did wonder if this may be something to do with the pre-processing that I mIRC must do before things are passed to the PCRE engine, for example stripping the colour codes with /S and the extra \co \cb escapes, which would mean it was a mIRC bug.

Last edited by Chessnut; 05/04/11 11:57 PM.