#201595 01/07/08 04:11 AM
Joined: Jul 2008
Posts: 236
s00p Offline OP
Fjord artisan
I'd like to express my opinion regarding your implementation of regular expressions.

1. PCRE is quite possibly the poorest choice you could make, considering there are chances that the regular expressions used could be quite complex, and over 30 characters in length. See this site for benchmarks regarding the Perl regular expression engine (which is very similar to the PCRE engine): http://swtch.com/~rsc/regexp/regexp1.html (and note the difference in Y-axis scale in the first graphs) ...

2. Despite what you think, it is in your best interest to provide documentation for your product, so developers find it easier to migrate from other programming/scripting languages. Simply stating "It is beyond the scope of this help file to explain how Regular Expressions work. There are many websites on the internet that introduce regular expressions and provide examples." is ridiculous. What if someone assumes they can use a different flavour? At the very least, provide a link to a tutorial for P[expletive removed] C[expletive removed] R[expletive removed] E[expletive removed] in your next release?

edit: url wasn't working. I blame the trailing ellipsis.
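
For reference, the benchmark on the linked page (as I recall it) matches the pattern a?ⁿaⁿ against the string aⁿ. Below is a minimal Python sketch of the two matching strategies on that one pattern. It is not PCRE and not mIRC's engine. The names `pathological`, `backtrack_steps` and `stateset_steps`, and the `(char, optional)` token list, are made up for illustration. It counts work done rather than time, so the numbers show growth, not real timings.

```python
def pathological(n):
    """Tokens for a?^n a^n and the text a^n. Each token is (char, optional)."""
    tokens = [("a", True)] * n + [("a", False)] * n
    return tokens, "a" * n


def backtrack_steps(tokens, text):
    """Recursive backtracking match; returns (matched, number of calls)."""
    calls = 0

    def go(i, j):
        nonlocal calls
        calls += 1
        if i == len(tokens):
            return j == len(text)
        ch, optional = tokens[i]
        # Greedy: try to consume a character first, then fall back to skipping.
        if j < len(text) and text[j] == ch and go(i + 1, j + 1):
            return True
        return optional and go(i + 1, j)

    return go(0, 0), calls


def stateset_steps(tokens, text):
    """Thompson-style simulation; returns (matched, states examined)."""

    def closure(states):
        # Add every position reachable by skipping optional tokens.
        seen = set()
        stack = list(states)
        while stack:
            i = stack.pop()
            if i in seen:
                continue
            seen.add(i)
            if i < len(tokens) and tokens[i][1]:
                stack.append(i + 1)
        return seen

    steps = 0
    current = closure({0})
    for c in text:
        advanced = set()
        for i in current:
            steps += 1
            if i < len(tokens) and tokens[i][0] == c:
                advanced.add(i + 1)
        current = closure(advanced)
    return len(tokens) in current, steps


if __name__ == "__main__":
    for n in (4, 8, 12, 16):
        tokens, text = pathological(n)
        print(n, backtrack_steps(tokens, text), stateset_steps(tokens, text))
```

In this sketch the backtracking call count roughly doubles each time n grows by one, while the state-set count grows roughly with n squared. The state-set version keeps no record of captured text, which is the reason backreferences do not fit it.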

Last edited by Khaled; 01/07/08 03:38 PM.
s00p #201618 01/07/08 01:39 PM
Joined: Aug 2007
Posts: 334
Pan-dimensional mouse
I highly doubt it's going to be changed, because many people already know PCRE. Changing it would break backwards compatibility, and people would have to relearn it.
tutorial for regex: www.regular-expressions.com

Last edited by foshizzle; 01/07/08 01:40 PM.

This is not the signature you are looking for
s00p #201628 01/07/08 04:39 PM
Joined: Oct 2003
Posts: 3,918
Hoopy frood
So what exactly is your proposed solution here? The link you referenced doesn't provide one, it just says "PCRE is slow and it could be faster, theoretically".

I'm not sure where your hatred of PCRE is rooted, but hopefully it wasn't that *one* link that convinced you. The regular expression matching used by Thompson in grep/ed may be slightly faster (it's really not "a million times" under 90% of datasets, that's just garbage) but it's far more simplistic. About 50% of the regular expressions I write in any language involve lookarounds, and 95% of them involve backreferences. The page claims little to no support for either. Here is a quote from the page you linked describing how a Thompson NFA handles backreferences:

Quote:

As mentioned earlier, no one knows how to implement regular expressions with backreferences efficiently, though no one can prove that it's impossible either. ... The simplest, most effective strategy for backreferences, taken by the original awk and egrep, is not to implement them.


That is a) stupid (for lack of any better term to get such an obvious point across) and b) a complete deal breaker for just about everybody using regular expressions in mIRC.
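
To make the two features concrete, here is a small sketch using Python's `re` module, which is a backtracking engine like PCRE but is not PCRE itself and not mIRC's $regex. The `\1`, `(?=...)` and `(?<!...)` constructs are written the same way in PCRE. The sample strings and variable names are made up for illustration.

```python
import re

# Backreference: \1 must match the same text that group 1 captured.
doubled_word = re.compile(r"\b(\w+) \1\b")

# Lookahead: require a ':' after the nick without consuming it.
nick_before_colon = re.compile(r"^\w+(?=:)")

# Negative lookbehind: match 'mirc' only when it is not preceded by '#'.
not_channel = re.compile(r"(?<!#)mirc")

print(doubled_word.search("this is is a test").group(1))
print(nick_before_colon.search("argv0: so what is the fix?").group(0))
print(not_channel.search("join #mirc to talk about mirc").start())
```

The backreference case is the one the quoted passage is about: the engine has to remember what group 1 matched, which a pure state-set simulation does not track.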

Another thing to remember here is that PCRE may be "1 million" times slower than Thompson NFAs, but mIRC is probably 1000 times slower than PCRE, so the bottleneck is definitely not the regular expression engine and it shouldn't be much of a concern.

Finally, mIRC could link to a tutorial, but typing "PCRE tutorial" into Google gives you plenty. I'm not sure why this is such a cause for frustration. Most developers should know how to use Google, and this is about the easiest search possible.

PCRE is indeed out of the scope of mIRC's docs, and the best place to find up-to-date documentation for PCRE is PCRE itself. mIRC makes it very clear which library is being used, so the logical thing to do is to visit PCRE's website to get documentation on the library. Again, this is guaranteed to be far more up to date and of higher quality than anything Khaled could maintain.


- argv[0] on EFnet #mIRC
- "Life is a pointer to an integer without a cast"
argv0 #213058 16/06/09 04:41 PM
Joined: Jul 2008
Posts: 236
s00p Offline OP
Fjord artisan
Quote:
The regular expression matching used by Thompson in grep/ed may be slightly faster (it's really not "a million times" under 90% of datasets, that's just garbage) but it's far more simplistic.


I'm pretty sure Thompson didn't use his NFA algorithm in grep/ed.

Quote:
So what exactly is your proposed solution here? The link you referenced doesn't provide one, it just says "PCRE is slow and it could be faster, theoretically".


Boost.Regex is functionally equivalent to Perl's regular expressions and uses a considerably faster algorithm. A solution doesn't necessarily involve entirely replacing things, either.

