Qbase speed doubles!

I tried out something that I'd mentioned in the prior post as an optimization to make Qbase faster, and it really works better than I'd hoped.

The normal method used by $base2 and Qbase to create the display-string is to take the big number in the internal state and whittle it down by continuously dividing it by 10 or 16 or whatever is the outbase. So if the number is N digits long, that forces you to go through that step N times. The main difference between $base2/$base3 and $Qbase is that Qbase groups several digits into chunks so that each of those N digit-shavings has fewer loops to accomplish the task. While it does get faster for the final digits as you've whittled it to a much smaller number, it can still take a lot of time.

The new change is to simply whittle it down by outbase^squared, so you can gather 2 digits at a time instead of just 1. For example, when translating to base16, instead of having a number from 0-15 that's your 1 shaved digit, you now have instead a number 0-255 that you turn into 2 digits. The time needed to shave a digit out of the internal number is pretty much the same as the time to shave 2 digits, regardless if you're dividing by 16 or 256, so this would be close to half the time.

It's closer to 60% because there's a minor bit of extra work to chop the 0-256 into 2 numbers before finding them both in the out-alphabet. Plus, the time is higher than 50% because doing this requires the pockets of the internal state be 1/outbase the size they formerly could be, so for big numbers this can result in there being additional pocket chunks in the chain, and the additional chain link comes with the time needed for an extra loop.

Also this requires a final step at the finish because, now that you're processing outbase digits in pairs, this always results in an even-length string with a leading 0 if the string would've otherwise had an odd-length. So there's a minor extra step to trim off a leading zero.

This means that re-running the benchmark timings for the table in the previous post will have the times in the 'E' column be somewhere from 50-60% of the time listed in the table.

Note that the times in 'E' are greatly affected by the overhead of the scripting engine itself, due to the need to having many commands using $calc in order to simulate something that can be done with a single $calc by the 'B' column except for the ^ operator not working for numbers above 2^256 or so, and the way to get accurate integers at that scale from MAPM is to ramp up the precision for fractions to an extreme level, due to the internal MAPM method of doing division by creating 1/divisor as a long fraction so you can multiply by it, and the extra precision for the longer fraction makes it take longer for both that division and for the multiply-by-it. That can be a shortcut that's not noticeable for floats where any rounding is below the 30 digits of display, but it does affect calculations which need zero precision digits to the right of a decimal but need the left side to be accurate.

Also as an example of the overhead imposed on Qbase by doing extra commands, are the commands where I have 2 places that I'm checking to make sure that %QBase.maxcarry does not exceed another number which would cause an inaccurate doubles math result above 2^53. All this did was "if (var1 > var2) set var2 = var1", and none of this is happening in /bigfloat mode. When i simply commented out those 2 lines in the script, the benchmark time dropped by around 5%

So, I believe that if $base is internally doing things similar to how is done in column "B" (which is $base3 using $calc instead of the pow_3 alias scripted replacement), and could switch to using things more similar to column "E", that this would also speed up handling of large numbers. For the bottom row of the benchmark table that's using a 2048-bit number, these 2 changes changed the benchmark time from 1400 to 720.

There would be diminishing returns from trying to shave off 3 digits at a time, since the gain from 1/1 -> 1/2 is greater than from 1/2 -> 1/3, and processing 3 digits at a time instead of 2, is more than 3/2 the complication.

---

Below is the updated Qbase, where the changes mentioned here are speeding things up - only if you have $3 be like 10x or 16x instead of just the integer. I've also commented-out the double-check to see if an internal variable is exceeding 2^53, so you won't see that warning unless you find the comments having the '+++' and change these back to being not-comments.

--
Code
; only alias called by $Qbase is the validation at top
; $Qbase(number,inbase,outbase,zeropad,1=debug)
; bases can be 2-64 or 85, with zero being '0' only in base 2-36
; if $3 outbase contains 'X' then enables example speed optimizations for certain in/out combos
; strict enforcement of N-digit input number if $3 contains X or if either inbase/outbase > 36
alias qbase_validate_alphabet { var %pattern $+(^[,$2,]+$) | return $regex(foo,$1,%pattern) }
alias qBase {
  var %num $1 , %inbase $remove($2,X) , %outbase $remove($3,X) , %pad 0 $+ $4 , %v1 $v1 , %v2 $v2
  if (%inbase == 16) var -s %num $remove(%num,:,$cr,$lf,$chr(32))
  ;  clipboard %num
  if (($replace(%inbase,85,2) !isnum 2-64) || ($replace(%outbase,85,2) !isnum 2-64) || (%pad !isnum 0- $maxlenl) || (%num == $null)) goto syntax
  var %alphabet.36 0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ
  var %alphabet.52 ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz
  var %alphabet.62 0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz
  var %alphabet.64 ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/
  var %alphabet.85 !&'*+-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[]^_`abcdefghijklmnopqrstuvwxyz{}~

  if (!$regex(foo,%num %inbase %outbase %pad,^[A-Za-z0-9/+]+ [0-9]+ [0-9]+ [0-9]+)) goto syntax
  if     (%inbase isnum  2-36) var %alphabet.in %alphabet.36 , %fake.base 36 , %num $upper(%num)
  elseif (%inbase ==       52) var %alphabet.in %alphabet.52 , %fake.base 52
  elseif (%inbase isnum 37-62) var %alphabet.in %alphabet.62 , %fake.base 62
  elseif (%inbase isnum 37-62) var %alphabet.in %alphabet.85 , %fake.base 85
  else                         var %alphabet.in %alphabet.64 , %fake.base 64
  if (X isin $3) var %alphabet.in $left(%alphabet.in,%inbase) , %fake.base %inbase
  if ((%inbase == 16) && (0x* iswm %num)) var %num $mid(%num,3)
  if (!$qbase_validate_alphabet(%num,%alphabet.in)) goto syntax

  if (%outbase > %fake.base) var %fake.base $v1
  if     (%outbase isnum  2-36) var %alphabet.out %alphabet.36
  elseif (%outbase ==       52) var %alphabet.out %alphabet.52
  elseif (%outbase isnum 37-62) var %alphabet.out %alphabet.62
  elseif (%outbase isnum    85) var %alphabet.out %alphabet.85
  elseif (%outbase isnum 63-64) var %alphabet.out %alphabet.64
  ; for alphabets beginning with not-zero, use 1st digit of alpabet
  var %out.zero $left(%alphabet.out,1)

  bset -tc &qbase.alphabet.in  1 %alphabet.in
  bset -tc &qbase.alphabet.out 1 %alphabet.out

  ; support size of max(fakebase,inbase,outbase)
  var %squared.outbase 1
  if (X isin $3) var %squared.outbase %outbase
  var %carry , %working , %out.num , %pocketmax $calc( 9007199254740991 / %squared.outbase // %fake.base )
  ; all lines mentioning maxcarry are just debug to make sure didn't overflow 2^53
  var %QBase.maxcarry 0 , %ticks $ticksqpc , %debug

  if (x isin $3) {
    if ( %inbase  == %outbase) {
      ; faster string manipulation for commonly used zero padding like $base(num,16,16,8)
      var %out.num %num
      if (%out.zero $+ ?* iswmcs %out.num) var %out.num $regsubex(foo,%out.num,^ $+ %out.zero $+ +(?!$),)
      goto display_it
    }
    if (%inbase == 10) {
      ; Optimizes Base 10->anything to avoid mult/div/mod
      var %n $len(%pocketmax) - 01 , %-n 0 - %n , %carry_mult 10 ^ %n
      while (%num != $null) { var %working $right(%num,%n) %working, %num $left(%num,%-n) }
      goto ten_to_outbase
    }
    :continue_from_inbase_64
    if (%inbase == 16) {
      if (%inbase %outbase == 16 64) {
        ; Optimizes Base 16->64 to avoid mult/div/mod 6@4 -> mime 4@6 bytes
        if ($calc((((6- $len(%num)) % 6)+6) % 6)) var %num $str(0,$v1) $+ %num
        bset -c &qbase.tmp 1 $regsubex(foo,%num,/(..)/g,$base(\t,16,10) $+ $chr(32) )
        noop $encode(&qbase.tmp,bm) | var %out.num $bvar(&qbase.tmp,1-).text
        if (A?* iswmcs %out.num) var %out.num $regsubex(foo,%out.num,^A+(?!$),)
        goto display_it
      }
      if (%inbase %outbase == 16 2) {
        ; Optimizes Base 16->2 to avoid mult/div/mod
        var %out.num $replacexcs(%num,0,0000,1,0001,2,0010,3,0011,4,0100,5,0101,6,0110,7,0111,8,1000,9,1001,A,1010,B,1011,C,1100,D,1101,E,1110,F,1111)
        if (0?* iswmcs %out.num) var %out.num $regsubex(foo,%out.num,^0+(?!$),)
        goto display_it
      }

      if (%inbase == 16) {
        ; Optimizes Base 16->10 to avoid mult/div/mod
        ; if     (%fake.base isnum  2-16) var %n 12
        ; elseif (%fake.base isnum 17-36) var %n $len($base(%pocketmax,10,16)) - 1
        if     (%fake.base isnum  2-36) var %n $len($base(%pocketmax,10,16)) - 1
        else {
          var %carry_mult %fake.base , %n 1
          while ($calc(%carry_mult * %fake.base) < %pocketmax) { var %carry_mult $v1 | inc %n }
        }
        var %-n 0 - %n , %carry_mult 16 ^ %n
        while (%num != $null) { var %working $base($right(%num,%n),16,10) %working, %num $left(%num,%-n) }
        goto ten_to_outbase
      }
    }
    if (%inbase == 64) {
      if ( %inbase %outbase == 64 16) {
        ; Optimizes Base 64->16 to avoid mult/div/mod
        if ($calc((((4- $len(%num)) % 4)+4) % 4)) var %num $str(A,$v1) $+ %num
        bset -tc &qbase.tmp 1 %num | noop $decode(&qbase.tmp,bm)
        var %out.num $regsubex(foo,$bvar(&qbase.tmp,1-),/([0-9]+) ?/g,$base(\t,10,16,2))
        if (0?* iswmcs %out.num) var %out.num $regsubex(foo,%out.num,^0+(?!$),)
        goto display_it
      }
      if ( %inbase == 64notfaster ) {
        ; Optimizes Base 64->* to avoid mult/div/mod
        if ($calc((((4- $len(%num)) % 4)+4) % 4)) var %num $str(A,$v1) $+ %num
        bset -tc &qbase.tmp 1 %num | noop $decode(&qbase.tmp,bm)
        var %num $regsubex(foo,$bvar(&qbase.tmp,1-),/([0-9]+) ?/g,$base(\t,10,16,2))
        var %inbase 16 | goto continue_from_inbase_64 | goto from64
      }
    }
    if (%inbase == 2) {
      if (%inbase %outbase == 2 16) {
        ; Optimizes Base 2->16 to avoid mult/div/mod replace binary with hex digit after making mult of 4
        if ($calc((((4- $len(%num)) % 4)+4) % 4)) var %num $str(0,$v1) $+ %num
        var %out.num $replacexcs(%num,0000,0,0001,1,0010,2,0011,3,0100,4,0101,5,0110,6,0111,7,1000,8,1001,9,1010,A,1011,B,1100,C,1101,D,1110,E,1111,F)
        if (0?* iswmcs %out.num) var %out.num $regsubex(foo,%out.num,^0+(?!$),)
        goto display_it
      }
    }
    :from64
    if ( %fake.base isnum 2-36) {
      var %n $len($base(%pocketmax,10,%inbase)) - 1 , %-n 0 - %n , %carry_mult %inbase ^ %n
      while (%num != $null) { var %working $base($right(%num,%n),%inbase,10) %working, %num $left(%num,%-n) }
      goto ten_to_outbase
    }
  }
  ; handle the un-optimized inbase=2-64 outbase=2-64
  var %tmp %fake.base , %n 1 - 0
  while ($calc(%tmp * %inbase) < %pocketmax) { var %tmp $v1 | inc %n }
  var %-n 0 - %n , %carry_mult %inbase ^ %n
  while (%num != $null) { var %snip $right(%num,%n) , %buildworking 0 , %i 1 , %num $left(%num,%-n)
    while  ($mid(%snip,%i,1) != $null) { var %digitval $bfind( &qbase.alphabet.in ,   1, $v1 ).textcs - 1 , %buildworking $calc( ( %buildworking * %inbase ) + %digitval ) | inc %i }
    ;while ($mid(%snip,%i,1) != $null) { var %digitval $poscs( %alphabet.in       , $v1,   1 )        - 1 , %buildworking $calc( ( %buildworking * %inbase ) + %digitval ) | inc %i }
    ;debug warning +++ if (%buildworking > %QBase.maxcarry) { var %QBase.maxcarry %buildworking }
    var %working %buildworking %working
  }
  :ten_to_outbase
  var %debug %n %debug time %inbase -> 10:  $calc($ticksqpc - %ticks)  $qt(%working)

  if (%inbase %outbase == 10 10) {
    var %out.num $regsubex(foo,%working,/([^ ]+) ?/g,$right($+($str(0,%n),\t),%n))
    goto display_it
  }

  if ((X isin $3) && (%outbase isnum 2-36)) {
    var %outbase^2 %outbase ^ 2
    :next_outbase_digitsq
    ; strip leading all-zeroes pockets aka $gettok(%working,2-,32)
    while (0 * iswm %working) { var %working $mid(%working,3) }

    var %carry 0 , %i 0 , %tokens $numtok(%working,32) | while (%i < %tokens) {
      inc %i | var %temp $calc( $gettok(%working,%i,32) + (%carry * %carry_mult) )
      ;debug warning +++ if (%temp > %QBase.maxcarry) { var %QBase.maxcarry $v1 }
      var %carry $calc(%temp % %outbase^2) , %temp2 $calc(%temp // %outbase^2) , %working $puttok(%working,%temp2,%i,32)
    }
    :handled_digitsq
    if ((%working %carry != 0 0) || (%out.num == $null)) {
      var %dig1 $calc(%carry % %outbase)
      var %dig2 $calc(%carry // %outbase)
      var %out.num $bvar(&qbase.alphabet.out,$calc(1+%dig1)).text $+ %out.num
      var %out.num $bvar(&qbase.alphabet.out,$calc(1+%dig2)).text $+ %out.num
      ; var %out.num $bvar(&qbase.alphabet.out,$calc(1+%carry)).text $+ %out.num
    }
    if (%working != 0) goto next_outbase_digitsq
    if (%out.zero $+ *? iswm %out.num) var %out.num $mid(%out.num,2)
    goto display_it
  }
  else {
    :next_outbase_digit
    ; strip leading all-zeroes pockets aka $gettok(%working,2-,32)
    while (0 * iswm %working) { var %working $mid(%working,3) }
    var %carry 0 , %i 0 , %tokens $numtok(%working,32) | while (%i < %tokens) {
      inc %i | var %temp $calc( $gettok(%working,%i,32) + (%carry * %carry_mult) )
      ;debug warning +++ if (%temp > %QBase.maxcarry) { var %QBase.maxcarry $v1 }
      var %carry $calc(%temp % %outbase) , %temp2 $calc(%temp // %outbase) , %working $puttok(%working,%temp2,%i,32)
    }
    :handled_digit
    if ((%working %carry != 0 0) || (%out.num == $null)) var %out.num $bvar(&qbase.alphabet.out,$calc(1+%carry)).text $+ %out.num
    if (%working != 0) goto next_outbase_digit
  }
  :display_it
  ;debug warning +++  if (%QBase.maxcarry > $calc(2^53)) { echo -a 8,4 Warning: probable math error using value above 2^53 %QBase.maxcarry Please let maroon know which number/inbase/outbase produces this $1 $2 $3 n: %n fb: %fake.base  %working | halt }
  if ($len(%out.num) < %pad) var %out.num $str(%out.zero,$calc($v2 -$v1)) $+ %out.num
  if ($5) echo 3 -ag %debug tot $2 -> $3 time:  $calc($ticksqpc - %ticks)  n: %n
  if (%v1 == %v2) noop | return %out.num
  :syntax
  echo -sc info2 *$QBase(number,inbase,outbase [,pad length N][,anything=debug info]) $v1 vs $v2 : fb %fake.base n %n : $1-3
  halt
}