Posted By: bloodfog Hash Tables & Regex Question - 23/04/05 03:53 PM
Hello,
I'm new to hash tables. I know how to use /hadd, /hdel and $hget, but what should I do to save my hash table on exit and restore it on start? All my data will be saved in a hash table, and I want it to always be there (without using the -m flag, because the table will always exist).

What's the best way to use regex for bad word detection? $regex($1-,badword1|badword2) is just too simple, and sometimes it doesn't work as it's supposed to. What would you use for your bad word detection?

Hope you get me.
Thanks.
Posted By: JAFO Re: Hash Tables & Regex Question - 23/04/05 04:01 PM
For the hash file use 2 events.
An on start event using the /hload command.
And an on exit event using the /hsave command.
They're both in the help file and really easy to use.

/hload -sbni <name> <filename> [section]
/hsave -sbnioau <name> <filename> [section]
Load or save a table to/from a file.

Sorry, but I don't think I have ever used regex.
Posted By: SladeKraven Re: Hash Tables & Regex Question - 23/04/05 04:03 PM
On start, create the table with /hmake. Then, if the hash file exists, load its data into the hash table with /hload.
Posted By: bloodfog Re: Hash Tables & Regex Question - 23/04/05 04:12 PM
Uhm, so it'll look like this?

on *:start:{ hmake settings 100 | hload settings settings.hsh }
on *:exit:{ hsave -o settings settings.hsh }

And what about my 'n00b' regex question? ;[
Thanks guys.
Posted By: SladeKraven Re: Hash Tables & Regex Question - 23/04/05 04:22 PM
Code:
On *:Start: {
  hmake settings 100
  if ($isfile(settings.hsh)) hload settings settings.hsh
}

On *:Exit: { hsave -o settings settings.hsh }
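
Once the table is loaded on start, reading and writing settings could look roughly like the sketch below. The alias name and the "nickcolor" item are hypothetical and used only for illustration; it assumes the "settings" table already exists from the start event above.

```mirc
; Hypothetical alias and item name, for illustration only. Usage: /settest
alias settest {
  ; store a value under the item "nickcolor" in the existing "settings" table
  hadd settings nickcolor 4
  ; read it back; whatever is in the table at exit is written out by /hsave
  echo -a nickcolor is $hget(settings,nickcolor)
}
```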



As for $regex(), I've never used it, so I'm not too sure what to do on that front. What I can suggest is searching the forum for the keyword $regex() with the date range expanded to 5 years. If you don't find what you're looking for, maybe someone will have posted here by then. :)
Posted By: qwerty Re: Hash Tables & Regex Question - 23/04/05 05:51 PM
The reason your regex doesn't always work can be either (or both) of the following:

- some of the bad words contain characters special to regex, like { or } or \ etc

- you didn't use regex quotes around the pattern, i.e. /badword1|badword2/i. Omitting the quotes can fail if the first bad word starts with "m", or if a bad word contains capital letters (regex is case-sensitive unless you tell it otherwise with the "i" modifier).
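
A minimal sketch of the second point, assuming a throwaway test alias (the name bwtest is hypothetical, not something from this thread): the pattern is wrapped in / / and carries the "i" modifier, so "BadWord1" and "badword1" are both caught.

```mirc
; Hypothetical test alias. Usage: /bwtest some text to check
alias bwtest {
  ; the capture group makes the matched word available through $regml(1)
  if ($regex($1-,/(badword1|badword2)/i)) echo -a Matched: $regml(1)
  else echo -a No match
}
```

Words containing characters special to regex would still need escaping on top of this.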

If I had relatively few bad words to watch for, I'd probably do it like this:
Code:
on @$*:text:$($+(/\Q,$replacecs(%badwords,\E,\E\\E\Q,$chr(44),\E|\Q),\E/iS)):#:{
  ban -k # $nick 2 Don't swear
}

This is similar to your regex way but also takes into consideration the things I said above. Note the %badwords variable; this should contain all bad words (or even phrases), separated by a comma. You'd set that variable once with
/set %badwords word1,word2,some phrase1,word3

If I had many words (more than 50), I'd use the technique described here. It's the fastest possible way of checking a line of text against many bad words. One thing only: $hmatch(swear_words,$strip($1-)) in that tutorial can be replaced by $hfind(swear_words,$strip($1-),1,W) (capital W). Both work (at least for now), but $hmatch() is obsolete. The new version is $hfind(), so you'd better get used to using it instead.
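
A rough sketch of how the $hfind() call might be wired up. The setup alias and the wildcard item names are assumptions for illustration and may differ from what the tutorial does; with the W switch, each item name in the table is treated as a wildcard and matched against the text.

```mirc
; Assumed setup, not taken from the tutorial. Usage: /swear_setup (run once)
alias swear_setup {
  if (!$hget(swear_words)) hmake swear_words 10
  ; each item name is a wildcard pattern; no data is stored with it
  hadd swear_words *badword1*
  hadd swear_words *badword2*
}

on @*:text:*:#:{
  ; W = the hash table item names are wildcards, matched against the line
  if ($hfind(swear_words,$strip($1-),1,W)) ban -k # $nick 2 Don't swear
}
```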
Posted By: Kelder Re: Hash Tables & Regex Question - 23/04/05 11:21 PM
For the regex stuff: bad word detection is hard. Add some extra characters, typos, or accented letters in place of normal ones, and any script fails.

A pretty easy way would be $regex($1-,/\b(?:badword1|badword2|badword3)\b/iS)
This just checks whether any of those words appear on their own, not inside another word, to prevent some false positives.
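
To see what the \b anchors do, here is a small sketch with a hypothetical test alias and a made-up sample word: the word is only matched when it stands alone, not when it is part of a longer word.

```mirc
; Hypothetical test alias. Usage: /bndtest
alias bndtest {
  ; "hell" inside "hello" has no word boundary after it, so this echoes 0
  echo -a $regex(hello there,/\bhell\b/i)
  ; here "hell" stands alone as a word, so this echoes 1
  echo -a $regex(go to hell,/\bhell\b/i)
}
```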

If you want to replace them with *** or something:
var %newtext, %i = $regsub($1-,/\b(?:badword1|badword2|badword3)\b/igS,***,%newtext)

If you want to match the number of stars (make sure * is NOT a bad word :D )
var %newtext = $1-
while ($regex(%newtext,/^(.*?)\b(badword1|badword2|badword3)\b(.*+)$/iS)) {
  var %newtext = $+($regml(1),$str(*,$len($regml(2))),$regml(3))
}

For more tamper-proof stuff: you could replace 'badword' with (for example) [b8][a@][d](?:w|\\\/\\\/)[o]r[d] but that's probably too much work :)
© mIRC Discussion Forums