| | 
| 
| 
|  |  
| 
Joined:  Feb 2003 Posts: 2,737 Hoopy frood |  
| OP   Hoopy frood Joined:  Feb 2003 Posts: 2,737 | 
Battling with all the different scripting methods available, I'm curious to see how people would code this simple function.
 $min(N1[,N2][,Nx...])
 Returns the lowest value of one or more numbers passed.
 
 I'm looking for two different examples, feel free to post one or both:
 Fastest (benchmarked any way you choose)
 Cleanest (using what you'd call neat-readable code)
 
 You can use $0, $numtok, $ [ $+ [ %i ], $gettok, WHILE, GOTO, $eval, $+()... whatever you think is fastest or cleanest.
 
 The purpose of this is to gleem some insight on the different basic methods people use.
 
 - Raccoon
 
 PS.  Tip on built-in bounds checking:
 if ( text < 1 ) == false
 if ( text > 1 ) == true
 This is useful bounds checking as it will automatically ignore invalid user input, and you can still use $ifmatch if you want to.  If you were writing a $max function, you'd have to reverse this because text equates as greater-than a number.
 
 - Raccoon
 |  |  |  
| 
| 
|  |  
| 
pheonix
 |  
| pheonix | 
alias min {   var %i $0   var %g 1   var %min $1   while (%g <= %i) {     if ($eval($ $+ %g,2) < %min) && ($eval($ $+ %g,2) isnum) {        set %min $eval($ $+ %g,2)     }     inc %g   }   return The Minimum Value Is: %min } have max as well   alias max {   var %i $0   var %g 1   var %max $1   while (%g <= %i) {     if ($eval($ $+ %g,2) > %max) && ($eval($ $+ %g,2) isnum) {       set %max $eval($ $+ %g,2)     }     inc %g   }   return The Maximum Value Is: %max }  
Last edited by pheonix; 17/08/03 07:02 AM.
 |  |  |  
| 
| 
|  |  
| 
Joined:  Dec 2002 Posts: 168 Vogon poet |  
|   Vogon poet Joined:  Dec 2002 Posts: 168 | 
alias min { return $gettok($sorttok($1-,32,n),1,32) } |  |  |  
| 
| 
|  |  
| 
Joined:  Jan 2003 Posts: 2,973 Hoopy frood |  
|   Hoopy frood Joined:  Jan 2003 Posts: 2,973 | 
lmao, props to jerk.  Its the difference between reading the manual, and not.  |  |  |  
| 
| 
|  |  
| 
Joined:  Jan 2003 Posts: 149 Vogon poet |  
|   Vogon poet Joined:  Jan 2003 Posts: 149 |  |  |  |  
| 
| 
|  |  
| 
pheonix
 |  
| pheonix | 
after reading the help file about $sorttok, i came up with this: 
alias min {
  if (!$1) { return 2* $eval($min,0) $+ : Invalid Parameters }
  elseif (!$2) { return 2* $eval($min,0) $+ : Invalid Parameters } 
  elseif ($1) && ($1 !isnum) && (!$2) { return 2* $eval($min,0) $+ : Invalid Parameters }
  elseif ($2) && ($2 !isnum) && (!$3) { return 2* $eval($min,0) $+ : Invalid Parameters }
  else { 
    var %x, %i = $regsub($sorttok($1-,32,n),/[A-Z]|[a-z]/g,$null,%x)  
    return The Minimum Value Is: $gettok(%x,1,32)
  }
}
mine fails if you add characters like: :,./\?()*^%$£"!¬§   if your actually planning on testing these and cant be bothered to remove the code tags problem yourself, just ask :tongue:
Last edited by pheonix; 17/08/03 04:26 PM.
 |  |  |  
| 
| 
|  |  
| 
Joined:  Dec 2002 Posts: 117 Vogon poet |  
|   Vogon poet Joined:  Dec 2002 Posts: 117 | 
My snippet got placed on mircscripts yesterday:http://www.mircscripts.org/comments.php?id=2230 Stripped down code (removed props): 
alias max {
  var %max = $1, %i = $0
  while (%i) {
    if ($eval($ $+ %i,2) > %max) var %max = $ifmatch
    dec %i
  }
  return %max
}I didn't see a reason to check for non-numerical values since it works with text as well. (abc > abb ) alias min is almost the same: replace %max  with %min  and >  with < You can't use $sorttok because it would mess up if the input parameters contain the token character, it would work if you only want numerical input.   |  |  |  
| 
| 
|  |  
| 
pheonix
 |  
| pheonix | 
yeah, but thats not what the challenge is for    my maximum edit time on other post expired   here is my final entry   
alias min {
  var %i 1
  if (!$1) { return 2* $eval($min,0) $+ : Invalid Parameters }
  elseif (!$2) { return 2* $eval($min,0) $+ : Invalid Parameters } 
  elseif ($1) && ($1 !isnum) && (!$2) { return 2* $eval($min,0) $+ : Invalid Parameters }
  elseif ($2 !isnum) && (!$3) { return 2* $eval($min,0) $+ : Invalid Parameters }
  else {
    while (%i <= $0) {
      if ($eval($ $+ %i,2) isnum) { set %num %num $eval($ $+ %i,2) } 
      inc %i
    }
    tokenize 32 %num
    var %x, %i = $regsub($sorttok($1-,32,n),/[a-z]|[A-Z]/g,$null,%x)  
    return The Minimum Value Is: $gettok(%x,1,32)
  }
}
Last edited by pheonix; 17/08/03 06:16 PM.
 |  |  |  
| 
| 
|  |  
| 
Joined:  Jan 2003 Posts: 2,125 Hoopy frood |  
|   Hoopy frood Joined:  Jan 2003 Posts: 2,125 | 
How about this? alias max {
  window -hl @@
  aline @@ $*
  filter -wwcute 1 10 @@ @@
  var %a = $line(@@,1)
  close -@ @@
  return %a
}Regarding the original challenge in this thread, Jerk's method is almost certainly the fastest/shortest one to sort numbers .
Last edited by qwerty; 17/08/03 07:29 PM.
 |  |  |  
| 
| 
|  |  
| 
codemastr
 |  
| codemastr | 
But we aren't interested in sorting numbers, we are interested in finding the smallest. Therefore his can NOT be the fastest. The fastest sorting algorithm known is Theta(Nlg(N)) meaning, if there are N elements to sort, it takes N * log2(N) comparisons to sort it. Using a simple while() loop to go through all the entries and find the smallest is Theta(N) (it takes N comparisons). You're right, his way is the best way to sort it, but sorting is NOT the best way to find the smallest.    Edit: The fastest (least complex) way to do this is using: 
alias min {
  var %m = $1, %i = $0
  while (%i > 1) {
    if ($eval($ $+ %i,2) < %m) %m = $ifmatch
    dec %i
  }
  return %m
}
Thats similar to Rich's except it is slightly faster. Rich checks from $0 through 1. It is not necessary to check 1 ($1), because the inital value assigned to the minimum, we only have to check from $0 to 2 ($2). What I mean is, Rich's tests if ($1 < $1) which we know is never going to be true, so there is no need to test it.
Last edited by codemastr; 17/08/03 08:12 PM.
 |  |  |  
| 
| 
|  |  
| 
Joined:  Jan 2003 Posts: 2,125 Hoopy frood |  
|   Hoopy frood Joined:  Jan 2003 Posts: 2,125 | 
I'm sorry I said "sorting", I meant to say "finding the max". The $sorttok way is still the fastest way to do it and you, once again, ignore an important fact which I've already pointed out specifically to you in the past: sorting N items (Nlog(N) complexity) in the C language is faster than checking two values in a loop (N) in the mirc scripting language. You can't do the comparisons you just did when one algorithm is in C and the other in a higher-level language like mirc script. I won't even bother to benchmark it, as I'm sure of this, but it would perhaps be a good idea if you did.   |  |  |  
| 
| 
|  |  
| 
codemastr
 |  
| codemastr | 
Well, we were both wrong, according to my benchmarks. They have almost exactly the same speed. Both have strengths and weaknesses depending on the given input. The benchmark was tabulated using a call to each identifier 10000 times, each time with a random set of numbers (each number ranging from -4294967295 to 4294967295. Each set of numbers also contained a random amount of numbers ranging from 1 to 50. The input for each identifier was the same. The average was about a 100ms difference depending, which direction it was in varied. Sometimes mine came out ahead, sometimes Jerk's did. If you want to try it out yourself, the code is below. 
alias jerk_min {
  return $gettok($sorttok($1-,32,n),1,32) 
}
alias code_min {  
  var %m = $1, %i = $0  
  while (%i > 1) {  
    if ($eval($ $+ %i,2) < %m) %m = $ifmatch
    dec %i  
  }  
  return %m
}
alias bench {
  var %i = 0
  var %jerk_ticks = 0
  var %code_ticks = 0
  while (%i < 10000) {
    var %nums = $genrandnums($rand(1,50))
    var %start = $ticks
    .echo -q $jerk_min(%nums)
    var %end = $ticks
    %jerk_ticks = $calc(%jerk_ticks + (%end - %start))
    var %start = $ticks
    .echo -q $code_min(%nums)
    var %end = $ticks
    %code_ticks = $calc(%code_ticks + (%end - %start))
    inc %i
  }
  .echo -a jerk_min took %jerk_ticks ticks
  .echo -a code_min took %code_ticks ticks
}
;picks $1 random numbers
alias genrandnums {
  var %i = $1
  var %nums
  while (%i) {
    set %nums %nums $+ $rand(-4294967295,4294967295) $+ $chr(44)
    dec %i
  }
  return $left(%nums,-1)
}
;$rand doesn't like negative numbers, so here is one that does
alias negrand {
  if ($1 > -1 && $2 > -1) return $rand($1,$2)
  if ($1 < 0 && $2 < 0) return - $+ $rand($abs($1),$abs($2))
  else {
    var %hi = $calc($abs($1) + $abs($2))
    var %lo = $iif($1 < $2,$1,$2)
    return $calc($rand(0,%hi) + %lo)
  }
}
 |  |  |  
| 
| 
|  |  
| 
Joined:  Jan 2003 Posts: 2,125 Hoopy frood |  
|   Hoopy frood Joined:  Jan 2003 Posts: 2,125 | 
That's because, imo, you didn't benchmark these two aliases correctly. Without explaining why I think my benchmarks are more accurate (even though simpler in structure), here's what I used: alias max1 return $gettok($sorttok($1-,32,nr),1,32)
alias max2 {
  var %i = $0, %a = $1
  while %i {
    if ($eval($ $+ %i,2) > %a) %a = $ifmatch
    dec %i
  }
  return %a
}
alias bcmark { 
  var %i = $1, %t = $ticks
  while %i { $2- | dec %i }
  echo -s $2 time: $calc($ticks - %t)
}I then ran /bcmark 5000 max1 <list> and /bcmark 5000 max2 <same list> several times (at random intervals). The results are quite different that yours. At the first few runs, <list> was in both cases this: 34 34 68 29 09 34 563 653 354356 3563 3 35356 366 36 36 36 3452 653 236 The difference was very noticeable. A typical value pair from the various times I executed the commands is this: max1 time: 1094 max2 time: 8922 The I tried several times the same commands, with this input: 1 2 A typical result was this: max1 time: 437 max2 time: 1453 You can see two things from this: 1) max1 is always faster than max2 2) with different input lengths, max1 gives similar results while the results of max2 differ a lot. This confirms my thoughts that, since both aliases are made in mirc script, you can assume with acceptable accuracy that max1 is constant-time (it only calls an identifier once) while max2 is linear-time (N, as it loops through %i).     |  |  |  
| 
| 
|  |  
| 
codemastr
 |  
| codemastr | 
But your <list> is not sufficiently randomly chosen. When doing benchmarking, you have to try and anticipate real world needs. The best way to do this is to allow as much randomization as possible and then run the randomizations several thousands of times. The reason is, you don't know what the real use will be. You might think your benchmarks are more accurate, however having taken classes in benchmarking, you're taught a few major rules. #1 make sure both functions are tested under the EXACT same circumstances, #2 make sure the input to the functions is sufficiently randomly chosen, #3 only benchmark the code that needs to be benchmarked. Your benchmark doesn't follow #2, or #3. It doesn't randomize, for example, "34 34 68 29 09 34 563 653 354356 3563 3 35356 366 36 36 36 3452 653 236" which is what you chose, will work very fast if the sorting algorithm mIRC uses is Radix sort. It's optimized for when the numbers contain similar digits. So if mIRC does use radix sort, you've only shown that it is fast in that instance, not when each number has different digits. When testing a sorting algorithms there are generally 5 cases that must be tested:1.) the input is already sorted
 2.) the input is sorted in reverse order
 3.) the input is completely random
 4.) the input is partially sorted (1 2 3 5748 32154 7843)
 5.) the input is partially reverse sorted (3 2 1 5748 32154 7843)
 
 Your tests don't take any of those into account really.
 
 As for the third, you are benchmarking code that you don't care about. while %i { $2- | dec %i } is what is being benchmarked. All you really care about is the $2-. Granted, that same code is benchmarked for both which prevents any margin of error due to that, but you still only benchmark the function in question, not other code as well.
 |  |  |  
| 
| 
|  |  
| 
Joined:  Jan 2003 Posts: 2,125 Hoopy frood |  
|   Hoopy frood Joined:  Jan 2003 Posts: 2,125 | 
Heh, you don't get it, you never will. As in many times in the past, the point is lost deep in your technical, 'advanced' but essentially irrelevant and misleading explanations. Let's examine a part of your code that contains the most serious error:     var %start = $ticks
    .echo -q $jerk_min(%nums)
    var %end = $ticksYou record the $ticks difference from just one run of $jerk_min(%nums) and you expect accuracy...  guess what, /bcmark 1 max1 <list> (and max2 for many inputs) always reports 0 milliseconds on my computer. %end - %start will always be 0, so you don't record anything . Apparently, my computer needs less than a msec to run /max1 <some list> (or $jerk_min(%nums) for that matter). 10000 * 0 = 0 though. On a fast computer, you could end up with %jerk_ticks being 0 when your benchmark ends. If you don't understand what I'm saying and why what you did is wrong there's not much that I can do. However, this is the first thing you should be taught in your benchmarking class. Regarding $2-, yes, $2- adds a little overhead but it's a very small error factor; I was aware of it and certain that this error wouldn't alter my results, I just used it because I think /bcmark is a handy and reliable way to benchmark various aliases. Besides, your code also benchmarks an irrelevent item, the %nums variable (remember that you call $jerk_min(%nums) and not $jerk_min(1 2 3 5748 32154 7843) for example). Do you think it's ok to include %nums in the benchmark but it's not ok to include $2- or you just didn't notice? You can change my code around and benchmark the "real thing" but the results won't change.  As for the rest of the blabla about Radix algorithms etc, they have nothing to do with the issue. Sorting is done in C, it's pretty much a constant-time routine compared to mirc script. But we've already been through that...    |  |  |  
| 
| 
|  |  
| 
codemastr
 |  
| codemastr | 
You're right, I don't get it and it's not that I never will, it's that I don't want to. I'd prefer not to "get" something that I know is wrong.  |  |  |  
| 
| 
|  |  
| 
Joined:  Feb 2003 Posts: 2,737 Hoopy frood |  
| OP   Hoopy frood Joined:  Feb 2003 Posts: 2,737 | 
Unless I'm mistaken, or leaving out a bigger picture, qwerty is saying that your use of $rand is throwing off the benchmark completly. Random number generation is so slow, you could probably find the $min of 10 fixed values in less time than it takes to generate one $rand value... and certainly less time than it takes to generate 10 of them.  So it's no wonder that your results look identical for both functions, you are benchmarking $rand not $min. I could also bring up the time you used $lines in a WHILE loop, but that would be hitting below the belt.     What is sexy in C is not sexy in mIRC. -- A mighty scripter. |  |  |  
| 
| 
|  |  
| 
Joined:  Dec 2002 Posts: 1,253 Hoopy frood |  
|   Hoopy frood Joined:  Dec 2002 Posts: 1,253 | 
How can his use of $2- be construed as a loss of test integrity? It is used in exactly same way in both cases. If his test had changed ANY code (not possible since he's using the exact same /bcmark alias to test both), then you would be quite correct. But he doesn't, so you aren't. Who cares if { $2- | dec %i } adds overhead? It's the same  overhead for any  test run; that fact alone factors it out of the benchmark as a consideration. Sure, it might pump up the total numbers, but it will pump them up exactly the same for every test on which that alias is used. Do you mean to tell me that your variables require no allocation of memory? (That's one form of overhead, isn't it?) Jerk's method doesn't...not one variable used, nothing but native identifiers used. Your method also tests variables. (Doesn't memory management count as overhead, too?) Furthermore, both of you are using the stack unnecessarily (more memory management) when you could have just as easily inlined the two algorithms and tested just them inside their own loops (iterating to 47 gazillion) with %start and %end after each. Qualified remark (not having benchmarked it recently, I am unwilling to presume its veracity): the last time I checked, built-in $identifiers were faster than %variables of any sort. With respect to your Radix sort (you never mentioned if you were referring to a LDS or MSD Radix sort: is that important to how your LSDs or MSDs are not  going to be sorted into their appropriate buckets?), Theta (you made no mention in there of big-O notation nor even asymptotic bounds, upper and lower bounds of resource usages, though you could have but won't ) and the 5 different methods of testing the sorting algorithm that you're not going to test : ...   Where was I?    Oh yes, now I remember.   You are not testing $sorttok here, you are testing the RESULT of both the $sorttok and the $gettok it feeds. Then you are comparing the cumulative total time of that ENTIRE algorithm (not just some sorting algorithm you've (heard of|looked up|wrote a thesis on) somewhere) versus the results you achieve in your own algorithm. qwerty is quite right: mIRC isn't going to change its sorting method from one run to the next, nor is it going to spit out what the current sorting algorithm's name to verify/discredit anyone. If the two values were at all close (as your initial 0-result tests indicated), then it would make sense to check a compiled sorting routine's output based on input sorting order (or lack thereof) for further optimizational possibilities/problems. As it is, they're not even close.  :tongue:    |  |  |  
| 
| 
|  |  
| 
Joined:  Dec 2002 Posts: 117 Vogon poet |  
|   Vogon poet Joined:  Dec 2002 Posts: 117 | 
I think there's some errors in your test: 
  var %nums = $genrandnums($rand(1,50))
  ;now %nums should be a comma-separated list of 1 to 50 random numbers
  ...
  .echo -q $jerk_min(%nums)
  ;will retrun the same list, since there is only one token when using 32 for C
  ...
  .echo -q $code_min(%nums)
  ;will also retrun the same list, because %nums is now passed as $1
  $genrandnums won't work either, because it's calling $rand instead of $negrand, so it returns $str($chr(44),$rand(1,$1)).I made this test: 
alias jerk_min {
  tokenize 44 $1-
  return $gettok($sorttok($1-,32,n),1,32) 
}
alias code_min {  
  tokenize 44 $1-
  var %m = $1, %i = $0  
  while (%i > 1) {  
    if ($eval($ $+ %i,2) < %m) %m = $ifmatch
    dec %i  
  }  
  return %m
}
alias bench {
  var %i = 0
  var %jerk_ticks = 0
  var %code_ticks = 0
  hmake nums %1
  while (%i < $1) {
    hadd nums %i $genrandnums($rand(1,50))
    inc %i
  }
  var %i = 0
  var %start = $ticks
  while (%i < $1) {
    .echo -q $jerk_min($hget(nums,%i))
    inc %i
  }
  var %end = $ticks
  %jerk_ticks = $calc(%jerk_ticks + (%end - %start))
  var %start = $ticks
  var %i = 0
  while (%i < $1) {
    .echo -q $code_min($hget(nums,%i))
    inc %i
  }
  var %end = $ticks
  %code_ticks = $calc(%code_ticks + (%end - %start))
  inc %i
  .echo -a jerk_min took %jerk_ticks ticks
  .echo -a code_min took %code_ticks ticks
  hfree nums
}
;picks $1 random numbers
alias genrandnums {
  var %i = $1
  var %nums
  while (%i) {
    set %nums %nums $+ $negrand(-4294967295,4294967295) $+ $chr(44)
    dec %i
  }
  return $left(%nums,-1)
}
;$rand doesn't like negative numbers, so here is one that does
alias negrand {
  if ($1 > -1 && $2 > -1) return $rand($1,$2)
  if ($1 < 0 && $2 < 0) return - $+ $rand($abs($1),$abs($2))
  else {
    var %hi = $calc($abs($1) + $abs($2))
    var %lo = $iif($1 < $2,$1,$2)
    return $calc($rand(0,%hi) + %lo)
  }
}
Added tokenize to both aliases so they both work in this test and are slowed down equally. I don't know how much influence tokenize, $hget, while, %i and .echo have, but in this test $jerk_min was about 10 times as fast as $code_min: 100x
 jerk_min took 190 ticks
 code_min took 1750 ticks
 1750/190 = 9.210526
 jerk_min took 165 ticks
 code_min took 1725 ticks
 1725/165 = 10.454545
 jerk_min took 155 ticks
 code_min took 1696 ticks
 1696/155 = 10.941935
 1000x
 jerk_min took 1960 ticks
 code_min took 17770 ticks
 17770/1960 = 9.066327
 jerk_min took 1875 ticks
 code_min took 19476 ticks
 19476/1875 = 10.3872
 
 (Just found out $sorttok doesn't like negative numbers:$sorttok(-1 -2 1,32,n) = -1 -2 1 So $jerk_min doesn't work properly for negative numbers, but for positive numbers it's faster)          |  |  |  | 
 |