Joined: Jan 2005
Posts: 25
Ameglian cow
OP
I have this REAL HUGE file that contains most known dictionary words. I'm attempting to make a simple $SpellCheck alias, and was wondering which method I could use to search through this file the fastest. This is what I've got so far:
;note: I've loaded the file using /hload -n
spellcheck {
  var %i = 1,%m = $numtok($1-,32),%o
  while (%i <= %m) {
    var %w = $gettok($1-,%i,32)
    var %w2 = $+(/\b,%w,\b/g)
    if ($hfind(spell,%w2,0,r).data < 1) { %o = $addtok(%o,%w,44) }
    inc %i
  }
  return $iif(%o,%o,$false)
}
It works, but it's still kind of slow (the file has 22759 lines and is 1.6 MB). Any suggestions? Would /loadbuf be better? Thanks =)
- I AM -
Joined: Sep 2003
Posts: 4,230
Hoopy frood
It would have helped if you had described what it's meant to do as well, rather than leaving us to work it out. It appears you pass one or more words ($1-) and check each for being present in the file; if a word is NOT present, you add it to a list of missing words that you return, otherwise you return $false. I'm going to work on that assumption (I think it's right).
;note: I've loaded the file using /hload -n
spellcheck {
  var %i = 1,%m = $numtok($1-,32),%o
  while (%i <= %m) {
    var %w = $gettok($1-,%i,32)
    [color:orange]if (!$istok($gettok($cr $1-,$+(1-,%i),32),%w,32)) {[/color]
      var %w2 = $+(/\b,%w,\b/g)
      if [color:blue](!$hfind(spell,%w2,1,r).data)[/color] { %o = $addtok(%o,%w,44) }
    [color:orange]}[/color]
    inc %i
  }
  return $iif(%o,%o,$false)
}

This is only a small thing, but the orange IF checks whether %w has already been checked earlier in $1-, and if so it doesn't need to be checked again.

The blue change is the one I think might speed things up. Before, you were asking for the grand total of matches for %w, so after it found the first one it would keep searching for others, which it likely wouldn't find anyway. Why waste that time searching? With a 1 in there, it now returns the first matching item name (in this case an item number), or $null if it did not find one (hence the ! in front, since the list collects the words that were not found). That saves at least the time spent searching for match number 2 and onwards.

Other things to worry about: one concern is what you might be passing as $1-. Unless it's just words, you can end up creating odd regexes, since some symbols have a special meaning in regex matching. Also, I don't think you're dealing with case insensitivity in the regex (I might be wrong, I'm not good with regex myself).
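On the point about symbols in $1-: one possible way to keep them from being read as regex syntax is to wrap the word in \Q...\E, which PCRE (the regex engine mIRC uses, as far as I know) treats as literal text. This is only a sketch; the alias name spellcheck.pattern is made up for illustration, and it assumes the word never contains \E or a / of its own.

```mirc
; Hypothetical helper: builds the search pattern for one word.
; \Q...\E asks the regex engine to match everything in between literally.
; Assumption: $1 contains neither the sequence \E nor a / character.
spellcheck.pattern {
  return $+(/\b\Q,$1,\E\b/)
}
```

Inside the loop it would be used as var %w2 = $spellcheck.pattern(%w). Note that \b only matches next to a letter, digit or underscore, so a word that begins or ends with a symbol may still not match the way you expect.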
Joined: Jan 2005
Posts: 25
Ameglian cow
OP
Lol, sorry about being somewhat vague back there, but you got my point straight on nonetheless =). Thanks for the reply. Anyway, you brought to light many things I would otherwise have overlooked (like the part about checking an item twice). The file has common terms all in lowercase and acronyms in uppercase, so I meant for the $regex test to be case sensitive. And thanks for reminding me about the regex pattern stuff; I was planning to strip out regex syntax, but nearly forgot (if it weren't for you). Thanks a heap! =)
- I AM -
Joined: Apr 2005
Posts: 11
Pikka bird
It might work faster if you split the file into 26 smaller files, one for words starting with A, one for B, and so on. Then find the first letter of the word you're checking with $left and search the corresponding file.
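A rough sketch of how that could look with hash tables, in case it helps. Everything named here is an assumption for illustration: the files words.a.txt to words.z.txt in the mIRC folder, the tables spell.a to spell.z, and the alias names. It also assumes the words were split by first letter without regard to case, so an acronym like NASA sits in the N file.

```mirc
; Hypothetical loader: one table per letter, each filled from its own file.
; /hload will complain if a file is missing.
spellload {
  var %c = 97
  while (%c <= 122) {
    var %t = $+(spell.,$chr(%c))
    if ($hget(%t)) { hfree %t }
    hmake %t 100
    hload -n %t $+(words.,$chr(%c),.txt)
    inc %c
  }
}

; Same idea as the original alias, but each word is only looked up
; in the table for its first letter.
spellcheck {
  var %i = 1, %m = $numtok($1-,32), %o
  while (%i <= %m) {
    var %w = $gettok($1-,%i,32)
    var %t = $+(spell.,$lower($left(%w,1)))
    var %w2 = $+(/\b,%w,\b/)
    var %found = $false
    ; a word starting with a digit or symbol has no table, so it counts as not found
    if ($hget(%t)) { %found = $hfind(%t,%w2,1,r).data }
    if (!%found) { %o = $addtok(%o,%w,44) }
    inc %i
  }
  return $iif(%o,%o,$false)
}
```

After running /spellload once, $spellcheck(some words here) would return the words it could not find, or $false if all were found. Whether this actually beats a single table depends on how evenly the words spread across the letters, so it is worth timing both.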
A shotgun is always the answer.
Joined: Jan 2005
Posts: 25
Ameglian cow
OP
Cool idea! I'll try it out.
- I AM -