filter #164545 13/11/06 07:02 AM
Bullseye Offline OP
Is there a way to filter duplicate lines out of a txt file?
I have a bot, and people can add trivia questions to it.
I like to check the file now and then for duplicate questions,
but searching by hand takes a lot of time.
So I hope there is a way to read a line, loop through the file looking for the same line and erase it when found, then go to the next line and loop through the file again, and so on.

Greetzz

Re: filter #164546 13/11/06 09:01 AM
DaveC Offline
You could read the first line of the file, place it in a destination file, then filter "exclude" that line from the source file back into the source file, thus removing all copies of it, and repeat until the source file is empty.

The following code assumes no blank lines exist in the source file.

Code:
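alias ex {
  ; start with an empty destination file
  write -c destinationfile
  var %line = $read(sourcefile,nt,1)
  while (%line) {
    ; keep one copy of the line
    write destinationfile %line
    ; -x excludes every line matching %line, -c clears the output file first
    filter -ffcx sourcefile sourcefile %line
    var %line = $read(sourcefile,nt,1)
  }
}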


A better way, however, might be to do this:

Code:
alias ex {
  window -c @tempwin | window -h @tempwin
  filter -fk sourcefile ex.filter.alias
  savebuf @tempwin destinationfile
  window -c @tempwin
}
alias ex.filter.alias { if $len($1) { aline -n @tempwin $1 } }

This uses /filter to send each line of the file to an alias as $1, and that alias /aline -n's it to a hidden window. The -n switch prevents duplicate lines from being added. Finally, /savebuf writes the window out to the destination file.
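For anyone who wants to see the idea outside mIRC, here is a minimal Python sketch of what the hidden-window approach amounts to: skip blank lines, keep the first copy of each line, preserve the order. It is only an illustration, not mIRC script, and the name `dedupe_exact` and the sample questions are made up for this example.

```python
def dedupe_exact(lines):
    """Drop blank lines and exact repeats, keeping first occurrences in order."""
    seen = set()
    kept = []
    for line in lines:
        if not line:
            # skip empty lines, comparable to the $len($1) check in the alias
            continue
        if line in seen:
            # an identical line was already kept
            continue
        seen.add(line)
        kept.append(line)
    return kept


questions = [
    "capital of France?*Paris",
    "2 + 2?*4",
    "capital of France?*Paris",
    "",
    "Capital of France?*Paris",
]
result = dedupe_exact(questions)
print(result)
```

The comparison here is exact, so the last sample line, which differs only by a capital letter, is kept.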

This method gets around problems with *, & and ? in any line of the file. Because these have wildcard matching meanings, the /filter -x of the first example may go a bit bad and remove more lines than intended. For example, "What is this word amer***" would match ALL lines beginning with "What is this word amer". The second example will not fall over on this.
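The wildcard problem can be shown with a small Python sketch. It uses Python's `fnmatch` module as a rough stand-in for wildcard matching; this is an assumption for illustration only, since `fnmatch` has no `&` wildcard and treats `[` specially, so it is not identical to what /filter does. The sample lines are made up.

```python
from fnmatch import fnmatchcase

# A trivia line that happens to contain '*' characters, used as the match text
pattern = "What is this word amer***"

lines = [
    "What is this word amer***",
    "What is this word america",
    "What is this word amerika",
    "What is this word banana",
]

# Every line the pattern matches would be excluded, not just the exact copy
removed = [ln for ln in lines if fnmatchcase(ln, pattern)]
kept = [ln for ln in lines if ln not in removed]

print("removed:", removed)
print("kept:", kept)
```

Only the first line is a true duplicate of the pattern, yet two other questions that share the text before the first * are removed along with it.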

Re: filter #164547 13/11/06 10:29 AM
Bullseye Offline OP
Nice job.
Is there a way to make it also filter out duplicate lines that differ only by a capital letter?
Like so:
amuzement: geef de naam van het enig flexibele wapen in het spel "clue"*een touw

Amuzement: geef de naam van het enig flexibele wapen in het spel "clue"*Een touw

Re: filter #164548 14/11/06 07:53 AM
DaveC Offline
Well, to be absolutely sure I'd have to test this, but I think this would work. It uses the hidden-window method to get rid of exact dups, then passes the result through a (likely) slower method to match upper/lower case.

Code:
alias ex {
  window -c @tempwin | window -h @tempwin
  filter -fk sourcefile ex.filter.alias1
  write -c destinationfile
  filter -wk @tempwin ex.filter.alias2
  window -c @tempwin
}
alias ex.filter.alias1 { if $len($1) { aline -n @tempwin $1 } }
alias ex.filter.alias2 { if ($read(destinationfile,nts,$1) || !$readn) { write destinationfile $1 } }


OK, so this one creates the hidden window and filters to the /aline -n alias to remove identical lines, then filters to the 2nd alias, which scans the destination file and adds only lines that don't already exist.

Something I noticed about the $read(,s) switch is that it matches lines beginning with exactly the string passed, so *, & and ? are not expanded as in wildcard matches. The second alias will therefore look for a line beginning with exactly the text passed. Note that what $read returns is not the whole matched line but whatever remains on the line after the match, so an exactly matched line will actually return $null. That is why you WANT to write a line out when $read returns something, as that means the matched line is not the same as $1 (pretty weird, I know).
Then, to confuse the matter, if $read returns $null you need to check whether it was an exactly matched line or no match at all, which also returns $null. That is what $readn is for: it is zero if no match was located.

* I do think there might be some small logic hole in this somewhere, but I can't for the life of me locate it. So try it and see how it goes.
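As a cross-check for whatever the mIRC version produces, the intended result (keep the first copy of each line, ignoring upper/lower case) can be sketched in Python. This is an illustration only, not mIRC script; the name `dedupe_ignore_case` is made up, and comparing whole lines after `casefold()` is an assumption about what "the same question" should mean.

```python
def dedupe_ignore_case(lines):
    """Keep the first occurrence of each line, comparing whole lines case-insensitively."""
    seen = set()
    kept = []
    for line in lines:
        if not line:
            # skip blank lines, as the earlier aliases do
            continue
        key = line.casefold()  # comparison key; the original spelling is what gets kept
        if key in seen:
            continue
        seen.add(key)
        kept.append(line)
    return kept


questions = [
    'amuzement: geef de naam van het enig flexibele wapen in het spel "clue"*een touw',
    'Amuzement: geef de naam van het enig flexibele wapen in het spel "clue"*Een touw',
    'amuzement: geef de naam',
]
result = dedupe_ignore_case(questions)
print(result)
```

Because whole lines are compared, a line that is only the beginning of another line (the third sample) counts as a different question and is kept.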