Posted By: bloodfog Hash Tables & Regex Question - 23/04/05 03:53 PM
I'm new to hash tables. I know how to use /hadd, /hdel and $hget, but what should I do to save my hash table on exit and load it again on start? All my data will be saved in a hash table, and I want it to always be there -without using the -m flag- because it'll always be there.

What's the best way to use regex for bad word detection? $regex($1-,badword1|badword2) is just too simple, and sometimes it doesn't work as it's supposed to. What would you use for your bad word detection?

Hope you get me.
Posted By: JAFO Re: Hash Tables & Regex Question - 23/04/05 04:01 PM
For the hash file use 2 events.
An on start event using the /hload command.
And an on exit event using the /hsave command.
They're both in the help file and really easy to use.

/hload -sbni <name> <filename> [section]
/hsave -sbnioau <name> <filename> [section]
Load or save a table to/from a file.

Sorry, but I don't think I have ever used regex.
Posted By: SladeKraven Re: Hash Tables & Regex Question - 23/04/05 04:03 PM
In the on START event, create the table with /hmake. Then, if the hash file exists, load its data into the hash table with /hload.
Posted By: bloodfog Re: Hash Tables & Regex Question - 23/04/05 04:12 PM
Uhm, so It'll look like this?

on *:start:{ hmake settings 100 | hload settings settings.hsh }
on *:exit:{ hsave -o settings settings.hsh }

And what about my 'n00b' regex question? ;[
Thanks guys.
Posted By: SladeKraven Re: Hash Tables & Regex Question - 23/04/05 04:22 PM
On *:Start: {
  hmake settings 100
  if ($isfile(settings.hsh)) hload settings settings.hsh
}

On *:Exit: { hsave -o settings settings.hsh }

As for $regex(), I've never used it, so I'm not too sure what to do on that front. What I can suggest is searching the forum for the $regex() keyword. Expand the search to 5 years, and if you still haven't found what you're looking for, maybe someone will have posted by then.
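
A minimal sketch of a slightly more defensive variant of the START/EXIT events above, assuming the same table name (settings) and file name (settings.hsh) used in this thread. The setopt and getopt aliases are hypothetical helpers added only for illustration. Use this in place of the events above rather than next to them in the same script file.

```mirc
on *:START:{
  ; $hget(settings) returns the table name if the table already exists,
  ; so /hmake only runs when the table is missing
  if (!$hget(settings)) hmake settings 100
  ; only try to load when a saved file is actually there
  if ($isfile(settings.hsh)) hload settings settings.hsh
}

on *:EXIT:{
  ; -o overwrites the previous file; skip saving if the table is gone
  if ($hget(settings)) hsave -o settings settings.hsh
}

; hypothetical helpers: store and read one item
alias setopt { hadd settings $1 $2- }
alias getopt { return $hget(settings,$1) }
```

For example, after /setopt greeting Hello there, $getopt(greeting) returns the stored text, and it is written to settings.hsh when mIRC exits.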
Posted By: qwerty Re: Hash Tables & Regex Question - 23/04/05 05:51 PM
The reason your regex doesn't always work can be either (or both) of the following:

- some of the bad words contain characters special to regex, like { or } or \ etc.

- you didn't put the regex delimiters (the slashes) around the pattern, i.e. /badword1|badword2/i. Omitting them can fail if the first badword starts with "m" or if the said bad word contains capital letters (regex is case-sensitive unless you tell it otherwise, with the "i" modifier).

If I had relatively few bad words to watch for, I'd probably do it like this:
on @$*:text:$($+(/\Q,$replacecs(%badwords,\E,\E\\E\Q,$chr(44),\E|\Q),\E/iS)):#:{
  ban -k # $nick 2 Don't swear
}

This is similar to your regex way but also takes into consideration the things I said above. Note the %badwords variable; this should contain all bad words (or even phrases), separated by a comma. You'd set that variable once with
/set %badwords word1,word2,some phrase1,word3
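
To see what pattern that event actually builds, here is a small sketch. The alias name showbadpattern is hypothetical, and it assumes %badwords has been set as in the /set line above.

```mirc
; hypothetical helper: echoes the regex built from %badwords
alias showbadpattern {
  ; every word is wrapped in \Q...\E so regex-special characters are taken
  ; literally, and the commas become | (alternation)
  echo -a $+(/\Q,$replacecs(%badwords,\E,\E\\E\Q,$chr(44),\E|\Q),\E/iS)
}
; with %badwords = word1,word2,some phrase1 this echoes:
; /\Qword1\E|\Qword2\E|\Qsome phrase1\E/iS
```

The \E,\E\\E\Q pair only matters when a bad word itself contains \E: it closes the quoted section, matches a literal \E, then reopens the quoting.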

If I had many words (more than 50), I'd use the technique described here. It's the fastest possible way of checking a line of text against many bad words. One thing only: $hmatch(swear_words,$strip($1-)) in that tutorial can be replaced by $hfind(swear_words,$strip($1-),1,W) (capital W). Both work (at least for now), but $hmatch() is obsolete. The new version is $hfind(), so you'd better get used to using this instead.
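
A rough sketch of how the $hfind() check could be wired up. The linked tutorial is not reproduced here, so the layout of the table is an assumption: each item name in swear_words is taken to be a wildcard pattern, which is what the W switch matches against the text. The loadswears alias name is hypothetical.

```mirc
; hypothetical setup alias: item names are wildcard patterns (assumption)
alias loadswears {
  if ($hget(swear_words)) hfree swear_words
  hmake swear_words 10
  hadd swear_words *badword1* 1
  hadd swear_words *badword2* 1
}

on @*:TEXT:*:#:{
  ; W = treat the item names in the table as wildcards and match them against
  ; the stripped text; $hfind returns the first matching item name, if any
  if ($hfind(swear_words,$strip($1-),1,W)) ban -k # $nick 2 Don't swear
}
```

Run /loadswears once (or call it from an on START event) before the text event is expected to catch anything.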
Posted By: Kelder Re: Hash Tables & Regex Question - 23/04/05 11:21 PM
For the regex stuff: bad word detection is hard. Add some extra characters, typos, or accented letters instead of normal ones, and any script fails.

A pretty easy way would be $regex($1-,/\b(?:badword1|badword2|badword3)\b/iS)
This just checks if any of those words are in there, but not inside another word to prevent some false positives
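
A quick sketch of what the \b word boundaries buy you, using the harmless placeholder words bad and worse instead of real bad words. The alias name wordtest is hypothetical.

```mirc
; hypothetical test alias: $regex returns the number of matches
alias wordtest {
  ; "bad" only appears inside "badge", so \b prevents a match: echoes 0
  echo -a $regex(nice badge,/\b(?:bad|worse)\b/iS)
  ; "BAD" is a whole word and the i modifier ignores case: echoes 1
  echo -a $regex(that is BAD,/\b(?:bad|worse)\b/iS)
}
```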

If you want to replace them with *** or something:
var %newtext, %i = $regsub($1-,/\b(?:badword1|badword2|badword3)\b/igS,***,%newtext)

If you want the number of stars to match the length of the bad word (make sure * is NOT a badword):
var %newtext = $1-
while ($regex(%newtext,/^(.*?)\b(badword1|badword2|badword3)\b(.*+)$/iS)) {
  var %newtext = $+($regml(1),$str(*,$len($regml(2))),$regml(3))
}

For more tamper-proof stuff: you could replace 'badword' with (for example) [b8ß][aáàäâ@ÄÂÁÀÃã][dÐ](?:w|\\\/\\\/)[oóòõôöÔÖÓÒÕ]r[dÐ] but that's probably too much work.
© mIRC Discussion Forums