|
Joined: Dec 2002
Posts: 1,893
Hoopy frood
|
OP
Since there are already case-sensitive versions for $remove and $replace, it would be nice if one existed for $count.
|
|
|
|
Joined: Jan 2003
Posts: 2,973
Hoopy frood
|
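alias countcs {
  ; Usage: $countcs(text,char) - case-sensitive count of a single character
  /set -u0 %count 0
  /set -u0 %l 1
  ; Walk through the text one character at a time
  while (%l <= $len($1)) {
    if ($2 isincs $mid($1, %l, 1)) { /inc -u0 %count }
    /inc -u0 %l
  }
  return %count
}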
|
|
|
|
Joined: Dec 2002
Posts: 698
Fjord artisan
|
You could use $regex() for that: //echo -a $regex(abABaA,/A/g)
|
|
|
|
Joined: Jan 2003
Posts: 2,125
Hoopy frood
|
Nevertheless, $countcs, as well as iswmcs, mentioned in another thread, ought to be added, to complete the case-sensitive series (cs support exists for every other identifier/operator).
|
|
|
|
Joined: Dec 2002
Posts: 698
Fjord artisan
|
I didn't say I thought it shouldn't be added, or that it was a bad idea.
|
|
|
|
Joined: Dec 2002
Posts: 1,893
Hoopy frood
|
OP
KingTomato, thanks for your effort, but the alias you made isn't perfect because it (a) doesn't support multiple params as $count does, and (b) counts only single characters. Another version that comes to mind:
alias countcs {
  var %c = 0, %i = 2, %p
  while %i <= $0 {
    !.echo -q $regsub($($ $+ %i,2),/([\.\(\)\{\}\+\*\?\[\]])/g,\\\1,%p)
    var %c = %c + $regex($1,$+(/,%p,/g)), %i = %i + 1
  }
  return %c
}
Nimue, you're right. This could be done with regex, just as $removecs and $replacecs could, but then the user has to learn regex and memorize which reserved characters need escaping. I think a built-in identifier would make life easier for everyone and, as qwerty said, is expected for completing the case-sensitive series.
qwerty, I haven't seen $countcs mentioned before, which is why I brought it up here. If that thread weren't so old, I'd just post there, but I'm afraid no one reads it anymore.
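For readers who don't script in mIRC, the behaviour being requested can be sketched in Python. This is only an illustration, not mIRC code: the names `countcs` and `countcs_regex` are hypothetical, and summing the counts over several substrings is an assumption based on the alias above. Python's `str.count` is already case-sensitive and counts non-overlapping matches; the second function mirrors the regex approach, with `re.escape` standing in for the manual `$regsub` escaping.

```python
import re

def countcs(text, *substrings):
    """Case-sensitive count of non-overlapping occurrences, summed over all substrings."""
    # Empty substrings are skipped; str.count("") would otherwise return len(text) + 1.
    return sum(text.count(s) for s in substrings if s)

def countcs_regex(text, *substrings):
    """Same count via regex; re.escape neutralises every metacharacter."""
    return sum(len(re.findall(re.escape(s), text)) for s in substrings if s)

print(countcs("abABaA", "A"))          # 2
print(countcs("abABaA", "A", "ab"))    # 3
print(countcs_regex("a.b.c", "."))     # 2
```

The plain string version needs no escaping at all, which is the convenience a built-in identifier would offer.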
|
|
|
|
Joined: Jan 2003
Posts: 2,125
Hoopy frood
|
I was talking about iswmcs being recently mentioned in another thread. I don't remember any thread about $countcs() but I guess that if Search doesn't reveal anything (I didn't check), there isn't any.
|
|
|
|
Joined: Feb 2003
Posts: 2,737
Hoopy frood
|
Why do you escape characters in a character class?
/([\.\(\)\{\}\+\*\?\[\]])/g == /([][.(){}+*?])/g
Not sure what the purpose of that expression is, but you don't need to escape (any?) characters in character classes.
- Raccoon
Well. At least I won lunch. Good philosophy, see good in bad, I like!
|
|
|
|
codemastr
|
General responses:
Don't use isincs when checking a single character; use ===, which does a case-sensitive comparison and should be faster.
Definitely don't use regex when regex is not necessary. For trivial processing, regex doesn't fare as well as regular character processing; it is designed for more complex things. A small regex like the one needed here is wasteful in both speed and memory usage.
Lastly, about escaping characters in character classes: Raccoon is (almost) right. You don't need escaping except for one character, ]. If you simply had [a-z]], that is interpreted (by most regex libs, PCRE included) as the character class a-z followed by the literal character ]. You would need [a-z\]] to make it clear that the ] is part of the character class and not the metacharacter that ends it. You could also escape a -: [a\-z] means a, -, or z rather than the range a-z, but the more common syntax is simply [az-]; since the - is at the end, it is taken as a literal hyphen rather than as part of a range.
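The bracket and hyphen rules described above can be checked with a short Python sketch. It uses Python's `re` module rather than the PCRE library mIRC uses, on the assumption that the two agree on these particular rules; the sample string is made up for the demonstration.

```python
import re

text = "a-z]"

# [a-z]] is the class a-z followed by a literal "]", so only "z]" matches.
print(re.findall(r"[a-z]]", text))     # ['z]']

# Escaping the bracket puts it inside the class.
print(re.findall(r"[a-z\]]", text))    # ['a', 'z', ']']

# An escaped hyphen, or a hyphen at the end, is a literal "-" and not a range.
print(re.findall(r"[a\-z]", text))     # ['a', '-', 'z']
print(re.findall(r"[az-]", text))      # ['a', '-', 'z']
```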
|
|
|
|
Joined: Dec 2002
Posts: 1,893
Hoopy frood
|
OP
You're right. Quoted from man.txt: "All non-alphameric characters other than \, -, ^ (at the start) and the terminating ] are non-special in character classes, but it does no harm if they are escaped." According to that, we still have to escape the ] char, but the others may appear unescaped: /([.\][(){}+*?])/g.
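As a quick sanity check, the three spellings of the class that have come up in this thread can be compared side by side. The sketch below uses Python's `re` module, which is assumed to treat these classes the same way PCRE does; the variable names and the sample string are made up for the demonstration.

```python
import re

escaped = re.compile(r"[\.\(\)\{\}\+\*\?\[\]]")    # everything escaped
minimal = re.compile(r"[.\][(){}+*?]")             # only "]" escaped
bracket_first = re.compile(r"[][.(){}+*?]")        # "]" placed first, nothing escaped

sample = "a.b(c)d{e}f+g*h?i[j]k\\l"

results = [p.findall(sample) for p in (escaped, minimal, bracket_first)]

# All three classes pick out the same ten characters; the backslash is not one of them.
print("".join(results[0]))                 # .(){}+*?[]
print(results[0] == results[1] == results[2])   # True
```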
|
|
|
|
Joined: Feb 2003
Posts: 2,737
Hoopy frood
|
Right. Also, if I'm not mistaken, when you use [ and ] in your character class, the ] must precede the [ and they should (must?) be the first characters in the class, i.e. [][zig]
At least, this is how I see it whenever ][ are used in a character class. Probably good form if anything.
- Raccoon
|
|
|
|
Joined: Feb 2003
Posts: 2,737
Hoopy frood
|
Correct. - must be the very last character in a character class, and ] must be the very first character in a character class (so it should never have to be escaped), if either is to be used literally.
- Raccoon
PS. I haven't tested it yet, but I believe if you escape a character that doesn't need to be escaped, that literal '\' is added to the class. Logically, you would think to simply require '\' as the last character to prevent possible confusion, but '-' already reserves that throne. '\-' could arguably be an escaped '-', but I suppose that would be the best form.
Last edited by Raccoon; 11/05/03 04:32 AM.
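The untested guesses above are easy to try out. The sketch below uses Python's `re` module, not mIRC's PCRE, so treat it as an indication rather than proof; on these points the two are assumed to behave alike. The sample string is made up. It suggests that an unnecessary escape does not add a backslash to the class, and that a literal hyphen works in first position as well as last.

```python
import re

text = "a\\.b-c"   # the characters: a, backslash, dot, b, hyphen, c

# An unnecessary escape does not add a backslash to the class.
print(re.findall(r"[\.]", text))     # ['.']

# To match a backslash, the backslash itself has to be escaped.
print(re.findall(r"[\\.]", text))    # ['\\', '.']

# A hyphen is literal when it comes first as well as when it comes last.
print(re.findall(r"[-ac]", text))    # ['a', '-', 'c']
print(re.findall(r"[ac-]", text))    # ['a', '-', 'c']
```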
|
|
|
|
codemastr
|
Well, imho []a-z] seems somewhat confusing; if you ask me, [\]a-z] makes it more obvious (to a person) exactly what you mean.
|
|
|
|
|