Quote:
At any stage in the method I described, there is a pointer in the text to match (in my case what is right of the iswm operator), everything before the pointer is 'matched', so why would it need to re-use text that is before that pointer position? Let alone so many times that mIRC needs 5 sec to know a 44 characters wide wildcard text does not match a literal text?
Since we now know that the algorithm is greedy, we know it *has* to reuse text before the pointer, ie it has to backtrack (I hope that much is clear by now).

An interesting question however is: could your non-greedy idea work in general? Consider the case where
pattern = *X*X
input = AXBXCZ
According to the algorithm you described, the first * would match A (the substring until the first X in input). Then the second * would match B (the substring from the previous position to the next X). So we've gone through the entire pattern without encountering any mismatches. Does that mean we can return with "success"? Clearly, *X*X doesn't match AXBXCZ.

However, there's a way around this: check whether the last *-delimited token in the pattern is a suffix of the input, which is essentially what I said before about 'early termination', PCRE etc. A similar trick could be applied for the first *-delimited token: mirc could check if it's a prefix of the input. So by changing the algorithm to
1) be non-greedy
2) perform prefix/suffix checks (as above)
its running time would greatly improve.

Unless we're both missing a special case where all this would fail, this may be a good idea indeed. smile


/.timerQ 1 0 echo /.timerQ 1 0 $timer(Q).com