Challenge for mIRC Gurus (string search) #86462 11/06/04 10:49 PM
Joined: Jun 2003 Posts: 19 S sahmed01 OP Pikka bird
Hi,

Here is what I have (profanity10 is an alias name):

Code: on *:text:*:#: { if ($istok($strip($1-),penis,32)) { /profanity10 } }

I'd like to catch it even if the word penis is entered with any punctuation or special character, i.e. pnis, Pn/i*s etc. One solution I have in mind is to remove any such character from $1- before evaluating it, but how do I do it? Please help.

How come I can't find any good book on mIRC scripting? The mIRC help file is not good at all for a newcomer like me.

Also, how do you go through all the items in an array (hash table)? How do you know the lower/upper bounds of an array in mIRC?

Thanks

Re: Challenge for mIRC Gurus (string search) #86463 11/06/04 10:56 PM
Joined: Nov 2003 Posts: 2,327 T tidy_trax Hoopy frood
Code: if *p*e*n*i*s* iswm $strip($1-) { }

New username: hixxy

Re: Challenge for mIRC Gurus (string search) #86464 11/06/04 11:46 PM
Joined: Dec 2002 Posts: 2,962 Norwich, UK S starbucks_mafia Hoopy frood
For the first question the answer is a regular expression:

Code: if ($regex($1-,/(?:[^\w\s]|_)+p(?:[^\w\s]|_)+e(?:[^\w\s]|_)+n(?:[^\w\s]|_)+i(?:[^\w\s]|_)+s(?:[^\w\s]|_)+/S)) { profanity10 }

Note: There's no need to use $strip() around $1- in this case, because the regular expression does the stripping when the /S modifier is used.

As for hash tables, to use them efficiently you should access items by name instead of by index. However, sometimes using the indexes is necessary, in which case the lowest index will always be 1 and the highest can be found with $hget(hash_table_name, 0).item

Last edited by starbucks_mafia; 11/06/04 11:54 PM.

Spelling mistakes, grammatical errors, and stupid comments are intentional.

Re: Challenge for mIRC Gurus (string search) #86465 11/06/04 11:53 PM
Joined: Dec 2002 Posts: 2,962 Norwich, UK S starbucks_mafia Hoopy frood
Using that would get a lot of false positives: Saying 'Presenting the world famous Flying Arnie' or anything else with the characters p, e, n, i, and s in that order would trigger it.
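To see the false positive outside mIRC, here is a minimal Python sketch. It assumes that fnmatch-style wildcards are a close enough stand-in for mIRC's iswm operator with a pattern like *p*e*n*i*s*; the helper name iswm is made up for the example and is not an mIRC API.

```python
from fnmatch import fnmatchcase


def iswm(pattern: str, text: str) -> bool:
    """Rough stand-in for mIRC's iswm: case-insensitive wildcard match."""
    return fnmatchcase(text.lower(), pattern.lower())


# Any text that contains p, e, n, i, s in that order matches the pattern,
# no matter how many other characters sit in between.
print(iswm("*p*e*n*i*s*", "Presenting the world famous Flying Arnie"))
print(iswm("*p*e*n*i*s*", "hello there"))
```

The first call matches because each * may swallow whole words, which is exactly the problem described above.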

Re: Challenge for mIRC Gurus (string search) #86466 12/06/04 12:11 AM
Joined: Nov 2003 Posts: 2,327 T tidy_trax Hoopy frood
Yeah, I read it wrong. I presumed he meant p e n i s could be anywhere in the word; I guess he meant p!",\/e,*nis or similar.

Re: Challenge for mIRC Gurus (string search) #86467 12/06/04 12:24 AM
Joined: Jan 2003 Posts: 2,523 Q qwerty Hoopy frood
For this purpose, the relatively new subroutine feature of PCRE may come in handy:

Code: /((?:[^\w\s]|_)+)p(?1)e(?1)n(?1)i(?1)s(?1)/Si

/.timerQ 1 0 echo /.timerQ 1 0 $timer(Q).com

Re: Challenge for mIRC Gurus (string search) #86468 12/06/04 12:29 AM
Joined: Jun 2003 Posts: 19 S sahmed01 OP Pikka bird
Hi

Thank you very much for your quick and wise response. I am sorry if I was not able to make it clear. I guess what I need is to catch any punctuation or special characters between the letters of the word penis, i.e. p*e^n$i@s/ etc. If I can catch only those and leave the letters a-z alone, it would serve the purpose. It would also greatly help, and I would be very thankful, if you could explain the $regex line and all the characters used and their purpose.

Thanks

Re: Challenge for mIRC Gurus (string search) #86469 12/06/04 12:13 PM
Joined: Feb 2004 Posts: 714 Brasil Z Zyzzyx26 Hoopy frood
If you want it to detect when one single character is between the letters, you can also use this:

Code: if (* p?e?n?i?s * iswm $strip($1-)) { commands }

This way it will only detect the word when exactly one character is between each pair of letters, like p.e.n.i.s or p-e-n-i-s and so on.

Hope this helps

Zyzzy.

Last edited by Zyzzyx26; 12/06/04 12:14 PM.

"All we are saying is give peace a chance" -- John Lennon
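The behaviour of the ? wildcard can be tried out with a small Python sketch. It assumes that fnmatch wildcards (? for exactly one character, * for any run of characters) behave like mIRC's iswm for this pattern; the helper name iswm is hypothetical.

```python
from fnmatch import fnmatchcase


def iswm(pattern: str, text: str) -> bool:
    """Rough stand-in for mIRC's iswm: case-insensitive wildcard match."""
    return fnmatchcase(text.lower(), pattern.lower())


PATTERN = "* p?e?n?i?s *"

# Exactly one filler character between each pair of letters: match.
print(iswm(PATTERN, "what a p.e.n.i.s that was"))
# No filler characters at all: no match, every ? must consume something.
print(iswm(PATTERN, "what a penis that was"))
# Two filler characters in one gap: no match either.
print(iswm(PATTERN, "what a p--e-n-i-s that was"))
```

Note that with the spaces in the pattern the word also has to be surrounded by spaces, so a line that starts or ends with it would not match in this sketch.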

Re: Challenge for mIRC Gurus (string search) #86470 12/06/04 10:17 PM
Joined: Dec 2002 Posts: 2,962 Norwich, UK S starbucks_mafia Hoopy frood
Thanks, I had a feeling that it was possible but I couldn't remember how.

Re: Challenge for mIRC Gurus (string search) #86471 12/06/04 10:51 PM
Joined: Apr 2003 Posts: 701 Leuven, Belgium K Kelder Hoopy frood
Thanks for telling us about (or reminding us of) that subroutine thingy.

Just one suggestion for the regex:

Code: /((?:[^\w\s]|_)*)p(?1)e(?1)n(?1)i(?1)s(?1)/Si

This matches p--°en__.;is and things like that too.
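The effect of swapping + for * can be checked with a short Python sketch. Python's standard re module has no (?1) subroutine call, so the separator sub-pattern is written once as a string and spliced in between the letters; the names SEP and build are made up for the example.

```python
import re

SEP = r"(?:[^\w\s]|_)"  # one punctuation character, or an underscore


def build(word: str, quantifier: str) -> re.Pattern:
    """Put SEP with the given quantifier before, between and after the letters."""
    gap = SEP + quantifier
    letters = [re.escape(ch) for ch in word]
    return re.compile(gap + gap.join(letters) + gap, re.IGNORECASE)


plus = build("penis", "+")  # every gap needs at least one separator
star = build("penis", "*")  # every gap may be empty

for text in ["p--en__.;is", "-p-e-n-i-s-", "penis"]:
    print(text, bool(plus.search(text)), bool(star.search(text)))
```

With + only the fully separated "-p-e-n-i-s-" matches; with * all three do, including the plain word.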

Re: Challenge for mIRC Gurus (string search) #86472 12/06/04 11:31 PM
Joined: Dec 2002 Posts: 2,962 Norwich, UK S starbucks_mafia Hoopy frood
OK here's the updated version (using qwerty's adaptation plus another change so it matches properly):

Code: if ($regex($1-, /((?:[^\w\s]|_)*)p(?1)e(?1)n(?1)i(?1)s(?1)/Si)) { profanity10 }

Now to try and explain the regular expression /((?:[^\w\s]|_)*)p(?1)e(?1)n(?1)i(?1)s(?1)/Si...

First off I'll point out that my 'explanation' almost certainly won't explain anything unless you already know regular expressions at least a little bit, and even if you do know regular expressions it probably still won't explain anything. Regular expressions are a very powerful tool; unfortunately they're incredibly hard for people to understand, and trying to explain a full-blown expression to someone who doesn't already know regular expressions in general is not a good idea. You're better off using Google to find a regular expression tutorial and then coming back and reading my explanation when you're comfortable with them. Anyway, here it is:

[color:blue]/[/color]((?:[^\w\s]|_)*)p(?1)e(?1)n(?1)i(?1)s(?1)[color:blue]/[/color]Si
The /'s there mark the beginning and end of the actual expression.

/((?:[^\w\s]|_)*)p(?1)e(?1)n(?1)i(?1)s(?1)/[color:blue]Si[/color]
Since these are outside of the /'s they are treated as modifiers. Modifiers are like switches - they change how the entire expression behaves. The S modifier means that control codes are stripped from the text before it's compared (which is why we don't need to use $strip()). The i modifier means that the match is case-insensitive - this means that it will match 'penis', 'pENIS', 'p&En>iS' or any other variation on letter-case.

/[color:blue]([/color](?:[^\w\s]|_)*[color:blue])[/color]p(?1)e(?1)n(?1)i(?1)s(?1)/Si
The parentheses ( ) create a subpattern, which is used to group the expression and also means that anything matched by the expression within is captured and can be retrieved and used later.

/([color:blue](?:[/color][^\w\s]|_[color:blue])[/color]*)p(?1)e(?1)n(?1)i(?1)s(?1)/Si
These inner parentheses are again used to create a subpattern to group the expression within them; however, the ?: after the opening parenthesis means that what's inside is not captured. This makes the expression more efficient since we don't need to retrieve what that expression matches.

/((?:[color:blue][^\w\s][/color]|_)*)p(?1)e(?1)n(?1)i(?1)s(?1)/Si
The brackets ([ ]) are used to create a character set; it basically means that it will match if any of the characters it contains appears at that position in the text. However, the ^ means that the character set is negated, which means that it will match any character that it doesn't contain. \w and \s are metacharacters; they both represent groups of characters (kind of like special built-in character sets). \w represents the characters a to z, A to Z, 0 to 9, and underscore (_). \s represents all whitespace characters such as regular space, tab, and so on. So in total [^\w\s] means 'match a character which is not alphanumeric, an underscore, or whitespace'.

/((?:[^\w\s][color:blue]|[/color]_)*)p(?1)e(?1)n(?1)i(?1)s(?1)/Si
The | basically means 'or'. That is, match either the expression to the left or the expression to the right. So in the wider context of this particular expression it means 'match [^\w\s] (which in turn means 'a character which is not alphanumeric, an underscore, or whitespace') or match _ (which is a literal underscore)'. To put that into a single sentence, it means 'match a character which is not alphanumeric or whitespace'.

/((?:[^\w\s]|_)[color:blue]*[/color])p(?1)e(?1)n(?1)i(?1)s(?1)/Si
The * is a repetition quantifier. It means 'match the expression preceding it zero or more times' (any number of times). The expression directly preceding it is (?:[^\w\s]|_) (this is why the parentheses were used to group that expression), so to combine these two meanings we get 'match any number of characters which are not alphanumeric or whitespace'.

/((?:[^\w\s]|_)*)[color:blue]p[/color](?1)[color:blue]e[/color](?1)[color:blue]n[/color](?1)[color:blue]i[/color](?1)[color:blue]s[/color](?1)/Si
Each of those characters is taken literally, and you can replace them with any alphanumeric characters you want to match any word you choose.

/((?:[^\w\s]|_)*)p[color:blue](?1)[/color]e[color:blue](?1)[/color]n[color:blue](?1)[/color]i[color:blue](?1)[/color]s[color:blue](?1)[/color]/Si
Each of those (?1)'s simply means 'apply subpattern number 1 (the first subpattern defined) here' - the first subpattern being ((?:[^\w\s]|_)*). Basically this means that the expression behaves as if ((?:[^\w\s]|_)*) was used in each of those places.

You probably didn't learn anything from that, but it took me an age to write out so just look at the pretty colours anyway.
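For anyone who wants to poke at the pieces, here is a rough Python sketch of the same expression. Several things in it are assumptions rather than mIRC behaviour: Python's re has no (?1), so the separator group is written once and joined between the letters; the S modifier is imitated by a hand-written pattern for the usual mIRC control codes (colour, bold, reset, reverse, underline); and the names CONTROL_CODES, SEPARATORS, obfuscated and is_match are invented for the example.

```python
import re

# Approximation of mIRC control codes: colour (code 3 plus optional fg,bg
# digits), bold, reset, reverse and underline. Stands in for the S modifier.
CONTROL_CODES = re.compile(r"\x03(?:\d{1,2}(?:,\d{1,2})?)?|[\x02\x0f\x16\x1f]")

SEPARATORS = r"""
    (?:            # group, but do not capture
        [^\w\s]    # one character that is not alphanumeric, underscore or whitespace
      | _          # or a literal underscore
    )*             # repeated zero or more times
"""


def obfuscated(word: str) -> re.Pattern:
    """Allow SEPARATORS before, between and after the letters of word."""
    letters = [re.escape(ch) for ch in word]
    pattern = SEPARATORS + SEPARATORS.join(letters) + SEPARATORS
    # IGNORECASE plays the role of the i modifier; VERBOSE allows the comments.
    return re.compile(pattern, re.IGNORECASE | re.VERBOSE)


def is_match(word: str, text: str) -> bool:
    """Strip control codes first, then search for the obfuscated word."""
    return bool(obfuscated(word).search(CONTROL_CODES.sub("", text)))


print(is_match("penis", "p&En>iS"))          # mixed case with punctuation
print(is_match("penis", "p\x0304e\x03nis"))  # colour codes inside the word
print(is_match("penis", "pen is"))           # whitespace is not a separator
```

The colour-code case shows why the stripping matters: the digits that follow the colour character count as alphanumeric, so without stripping they would break the match.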

Re: Challenge for mIRC Gurus (string search) #86473 12/06/04 11:36 PM
Joined: Dec 2002 Posts: 2,962 Norwich, UK S starbucks_mafia Hoopy frood
This is one of the few times where I'm tempted to say 'LOL'. Thanks for the suggestion; as you can see from my latest (very very long) post, I've used the * repetition quantifier in the latest expression. Believe it or not I actually did that before I saw your post, it's just taken me almost an hour and a quarter to write out the entire regex explanation.

Re: Challenge for mIRC Gurus (string search) #86474 12/06/04 11:52 PM
Joined: Feb 2004 Posts: 2,019 Leuven, Belgium FiberOPtics Hoopy frood
Quote: You probably didn't learn anything from that, but it took me an age to write out so just look at the pretty colours anyway.

LOL. Well, even if he didn't understand much, it's still useful for those interested in learning about regular expressions. And I think it's a good habit to explain the code written for other users; I usually try to as well. And yes, the colors are also pretty to watch.

Greets

Gone.

Re: Challenge for mIRC Gurus (string search) #86475 13/06/04 12:23 AM
Joined: Nov 2003 Posts: 2,327 T tidy_trax Hoopy frood
Couldn't [^\w\s] be: \W\S

Re: Challenge for mIRC Gurus (string search) #86476 13/06/04 12:52 AM
Joined: Dec 2002 Posts: 2,962 Norwich, UK S starbucks_mafia Hoopy frood
No, because that would mean 'a non-alphanumeric character followed by a non-whitespace character'. To apply both conditions to the same character it would have to be written with a lookahead, something like (?=\W)\S. If you mean could it be written as [\W\S] (or as the alternation (?:\W|\S)), then that would match every character. If you imagine that \W is replaced with all characters that aren't alphanumeric (which means all whitespace and punctuation characters) and then replace \S with all non-whitespace characters (which would include all punctuation and alphanumeric characters), then you're left with all characters inside the character set (including punctuation characters twice).

Is it me or did I just completely overuse the word 'characters'?
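The difference between these spellings is easy to check by running them over a small sample. A minimal sketch in Python, whose re module treats \w, \W, \s and \S the same way for plain ASCII input; the lookahead form (?=\W)\S is one possible way to apply both conditions to a single character and is included here as an illustration.

```python
import re

sample = "a_1 !?\t"  # letter, underscore, digit, space, two punctuation marks, tab

neither = re.findall(r"[^\w\s]", sample)     # not a word char AND not whitespace
either = re.findall(r"[\W\S]", sample)       # not a word char OR not whitespace
same_char = re.findall(r"(?=\W)\S", sample)  # both conditions on one character
pair = re.findall(r"\W\S", sample)           # two consecutive characters

print(neither)                      # only the punctuation marks
print(len(either) == len(sample))   # the union matches every character
print(same_char)                    # same result as the negated set
print(pair)                         # a space followed by a punctuation mark
```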

Re: Challenge for mIRC Gurus (string search) #86477 13/06/04 09:31 AM
Joined: Apr 2003 Posts: 701 Leuven, Belgium K Kelder Hoopy frood
Code: /((?:[^\w\s]|_)*)p(?1)e(?1)n(?1)i(?1)s(?1)/Si

The (?:[^\w\s]|_) part means everything but letters, digits and whitespace. Here are some other ways to do it:

[^[:alnum:]\s]
[^a-z0-9\s] (notice I didn't include A-Z since the i modifier is enabled anyway)
[^[:alpha:]\d\s]
[^[:alpha:][:digit:][:space:]] (well, [:space:] and \s are not identical, but it's close enough for this)

I removed the (?: |_) part; character classes are normally faster than the | 'or'...

For more info about these, search the internet for pcre.txt (PCRE is the regex library used in mIRC). It contains a lot of unneeded info, but all the useful stuff is in there too. For the [:blah:] stuff, search the file for POSIX CHARACTER CLASSES.

PS: starbucks_mafia, allow me to say yipes@1h15m
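That the character-class forms accept the same characters as the alternation can be checked by brute force. A minimal Python sketch, limited to printable ASCII; Python's re does not support the POSIX [:alnum:] style classes, so only the forms it can parse are compared, and the dictionary keys are just labels.

```python
import re
import string

candidates = {
    "alternation": r"(?:[^\w\s]|_)",
    "a-z0-9": r"[^a-z0-9\s]",
    "a-z and \\d": r"[^a-z\d\s]",
}


def matched_chars(pattern: str) -> set:
    """Return every printable ASCII character the pattern accepts on its own."""
    rx = re.compile(pattern, re.IGNORECASE)
    return {ch for ch in string.printable if rx.fullmatch(ch)}


results = {name: matched_chars(p) for name, p in candidates.items()}

# All three accept exactly the ASCII punctuation characters, underscore included.
print(all(chars == set(string.punctuation) for chars in results.values()))
```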

Re: Challenge for mIRC Gurus (string search) #86478 13/06/04 10:06 AM
Joined: Jan 2003 Posts: 2,523 Q qwerty Hoopy frood
Very good points. Character classes are indeed faster than alternating subpatterns, as long as the two patterns are comparable in length (remember that mIRC has to parse the string before passing it to PCRE). I guess the fastest alternative would be ([^a-z\d\s]*)

Re: Challenge for mIRC Gurus (string search) #86479 15/06/04 08:57 PM
Joined: Dec 2002 Posts: 2,962 Norwich, UK S starbucks_mafia Hoopy frood
A little late with my reply here but I thought I might as well say this. The reason I used \w instead of a-z0-9 is that I'm more used to writing regexes for languages where there are many possibilities for the character set in use. Typically \w will respond according to the character set and use the appropriate characters that are defined as alphanumeric, whereas a-z0-9 is always just those 36 characters. Of course this doesn't make much difference in mIRC at this point, it was just force of habit; however, I guess it doesn't hurt to have some future-proofing put in for when Unicode is supported by mIRC.
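The difference shows up in an engine that is already Unicode-aware. A small Python sketch: Python 3's re treats \w as Unicode letters and digits for text patterns, which says nothing about what mIRC's PCRE build does; the names ascii_sep, word_sep and E_ACUTE are made up for the example.

```python
import re

ascii_sep = re.compile(r"[^a-z0-9\s]", re.IGNORECASE)  # fixed 36-character alphabet
word_sep = re.compile(r"[^\w\s]|_")                    # follows the engine's idea of \w

E_ACUTE = "\u00e9"  # the letter e with an acute accent

for ch in ["-", "7", E_ACUTE]:
    # Columns: character, treated as separator by a-z0-9 form, by \w form
    print(ascii(ch), bool(ascii_sep.fullmatch(ch)), bool(word_sep.fullmatch(ch)))
```

The accented letter counts as a separator for the a-z0-9 form but as a letter for the \w form, so the two filters would disagree on a word containing it.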

Re: Challenge for mIRC Gurus (string search) #86480 16/06/04 03:39 PM
Joined: Jun 2003 Posts: 19 S sahmed01 OP Pikka bird
Hi

Thank you all very much for your help, especially starbucks_mafia for spending so much time explaining $regex. I used it but it doesn't work for my purposes. Your code:

Code: if ($regex($1-, /((?:[^\w\s]|_)*)p(?1)e(?1)n(?1)i(?1)s(?1)/Si)) { profanity10 }

will catch even someone who spelled SPANISH as SPENISH, which is not acceptable. I came up with a solution with my limited knowledge of mIRC. Maybe it will help someone else like me.

Code: %ln = $1-
if $istok($strip($remove(%ln,,.,%,/,\,+,-,_,@,!,$,^,~,`,)),penis,32) { profanity10 }

The above line works fine for my purposes. I have no idea why $remove will not take $1- directly, so I had to assign it to %ln. I had quite a few bad words to manage and it was really hard and time-consuming to type all of them, so I created a small .EXE to generate the code. If someone needs the .exe for code generation, please let me know.

sahmed01
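The remove-then-compare-tokens idea can be sketched in Python as follows. The character list contains only the characters visible in the post above, and the function name contains_word is hypothetical; splitting on whitespace and comparing whole tokens is meant to mirror $istok with delimiter 32.

```python
# Characters to delete before comparing; taken from the $remove call above.
PUNCTUATION = ".%/\\+-_@!$^~`"


def contains_word(text: str, word: str) -> bool:
    """Delete the listed characters, then look for word as a whole token."""
    cleaned = text.translate(str.maketrans("", "", PUNCTUATION))
    return word.lower() in cleaned.lower().split()


print(contains_word("well p.e-n_i$s is here", "penis"))  # punctuation removed first
print(contains_word("I speak SPENISH", "penis"))         # substring only, no token
```

Because the comparison is against whole space-separated tokens, SPENISH no longer triggers it.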

Re: Challenge for mIRC Gurus (string search) #86481 16/06/04 04:00 PM
Joined: Dec 2002 Posts: 2,962 Norwich, UK S starbucks_mafia Hoopy frood
Ahh yes, good point. This should work correctly now:

Code: if ($regex($1-, /[color:red](?:^|\s)[/color]((?:[^\w\s]|_)*)p(?1)e(?1)n(?1)i(?1)s(?1)[color:red](?:\s|$)[/color]/Si)) { profanity10 }

The two bits in red have been added. They simply check that there is a whitespace character or the beginning/end of the string at the beginning/end of the word respectively.
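The effect of the two added anchors can be tried out in Python. This is a sketch under the same caveat as any port of the expression: Python's re has no (?1), so the separator pattern is repeated by string joining, and the names GAP and whole_word are invented for the example.

```python
import re

GAP = r"(?:[^\w\s]|_)*"  # any run of punctuation or underscores, possibly empty


def whole_word(word: str) -> re.Pattern:
    """Match word with optional separators, bounded by whitespace or string ends."""
    body = GAP + GAP.join(re.escape(ch) for ch in word) + GAP
    return re.compile(r"(?:^|\s)" + body + r"(?:\s|$)", re.IGNORECASE)


rx = whole_word("penis")

for text in ["SPENISH", "p*e^n$i@s/", "say p.e.n.i.s now", "penisland"]:
    print(text, bool(rx.search(text)))
```

Only the second and third strings match; SPENISH and penisland fail because a letter sits directly before or after the word.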

Re: Challenge for mIRC Gurus (string search) #86482 16/06/04 04:22 PM
Joined: Jun 2003 Posts: 19 S sahmed01 OP Pikka bird
Hi

Thank you starbucks_mafia, I will test the code and report back soon.

sahmed01
