Bigfloat feedback

Quote
#1 .bf flag

#1. Just mentioning this now, before it's too late to consider a change.

//var -s %foo. [ $+ [ $me ] ] 1

As Wims did, I also thought that this could cause some scripts to behave differently based on someone's nick, but that can already happen for things in the [windows] section, such as getting a full-screen window by being queried by the nick 'main', or having logging or font settings affected by being queried by the nick 'wstatus'. My suggestion for a marker was something like %bf.var or %.bf.var, since compound variables usually place the nick at the end, and having the marker at the beginning calls attention to the special nature of the variable when debugging.

At first I thought maybe also having the marker be case-sensitive, but that would probably cause too many problems in a language where case-sensitivity is not the norm. That's been one of my biggest problems when I've tried to code in 'C', where I'd have unknown variables because of not having the case be perfect.

Quote
#2 -switch for /hdec /hinc + $hget(table,item).bf

Unless it's considered redundant, another possible way for a flag is to give a new switch to the commands like /set /var /inc /dec /hinc /hdec

While bigfloat can look at /set /var /inc /dec to see what kind of variable they use, it would be helpful for at least /hinc and /hdec to have a switch to put them into BF mode, because the workaround is to feed them a %var.bf name whose value is either $null or 1:

//hadd -m maroon foo $calc(2^53) | echo -a old: $hget(maroon,foo) | hinc maroon foo | echo -a new: $hget(maroon,foo)
old: 9007199254740992
new: 9007199254740992

//hadd -m maroon foo $calc(2^53) | var %foo.bf 1 | echo -a old: $hget(maroon,foo) | hinc maroon foo %foo.bf | echo -a new: $hget(maroon,foo)

old: 9007199254740992
new: 9007199254740993

//hadd -m maroon foo $calc(2^53) | echo -a old: $hget(maroon,foo) | hinc maroon foo %null_variable.bf | echo -a new: $hget(maroon,foo)
old: 9007199254740992
new: 9007199254740993

The switches for these commands are all case-sensitive, so a capital letter like -M or -B wouldn't interfere with -m or -b, but to add distinction, perhaps it should avoid those letters and instead use -f or -F:

var -F %var 2 ^ 72
hinc -F maroon foo

A -f switch would enable using a value from another hashtable item without needing to either use the /bigfloat command or create a dummy %var.bf just to use as a stepping-stone.

//hadd -mz maroon bar $calc(2^53+12345) | hadd -z maroon foo $hget(maroon,bar) | timertest 44 1 echo -st result: $!hget(maroon,foo)

Unless it's restricted to hashtable item names like whatever.bf (which /hinc /hdec don't seem to do), it would also be useful to have $hget(table,item).bf to indicate whether the 'foo' item is incrementing or decrementing in BF mode, or whether it would get stuck due to loss of doubles precision.

//hadd -m maroon foo $calc(2^53-9) | hinc -c maroon foo | timertest 44 1 echo -st result: $!hget(maroon,foo)

Instead of returning $true, maybe it should return the switch letter, which would allow using the returned string as part of the switches.

Quote
#3 $round

#3 Yay, $round in bigfloat mode is much better at rounding '5' the correct direction, and my simplistic example on wikichip doesn't fail in bigfloat mode.

//bigfloat on | var %i 1 , %list | while (%i isnum 1-100) { var %list %list $round(%i $+ .05 ,1) | inc %i } | echo -a round: %list
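For comparison, the same round-the-5 behavior can be sketched in Python, where binary doubles show the mis-rounding and the decimal module models what a correctly rounding bigfloat mode can do (round_half_up is just an illustrative helper, not anything in mIRC):

```python
from decimal import Decimal, ROUND_HALF_UP

def round_half_up(s: str, digits: int) -> Decimal:
    # Exact decimal arithmetic with explicit round-half-up,
    # the behavior a bigfloat-style mode can provide.
    return Decimal(s).quantize(Decimal(1).scaleb(-digits), rounding=ROUND_HALF_UP)

# In binary doubles, 2.675 is actually stored slightly below 2.675,
# so a doubles-based round sends the '5' the wrong way:
print(round(2.675, 2))            # 2.67 in doubles
print(round_half_up("2.675", 2))  # 2.68 with exact decimal rounding
```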

Quote
#4 Bitwise

#4 To answer your reply on Bitwise: some of the bitwise operators could be supported as integers: $isbit $biton $bitoff $and() $xor() $or() $not()

And only in some cases would there need to be a 'bits' parm to determine how to handle the number.

In the following example cases where they cannot return an accurate result, the design decision can choose between returning zero or $null in the absence of parm3.

The 'bits' parm3 can also be useful in doubles mode, where you want $and(-number,number) to be cast as 16 bits instead of the default 32 bits.

Since BF mode would no longer have 2^32-1 as the range limit, there would also be a design decision about what to do for something like $not($calc(-2^71)). I'm guessing that the error condition for the invalid parameters mentioned below should either return '0' or $null, or halt with an error. It might be desirable for the error condition to be something that's impossible to confuse with a valid return value.

#4a: $isbit $biton $bitoff.

They will be much faster than translating large integers to base 2 in order to perform string surgery on them.

//bigfloat on | var %i 1 | while (%i isnum 1-64) { echo -a %i : $isbit(999999999999999999999999999999999,%i) | inc %i }

$biton and $bitoff should not return a result when num1 is negative and the bits parm isn't used, because of the ambiguity in how to cast it as positive.

As for $isbit(bignegative,bit-position), I don't see it requiring the 'bits' parm, except to limit the bit positions seen for a negative number. Without that bits parm, I *think* that since the 1-bits would extend infinitely to the left, the result would be '1' for all bit positions greater than 'something'.
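Python's arbitrary-precision integers already model this "infinite 1-bits to the left" view of negatives, so the expected behavior can be sketched there (isbit is a hypothetical stand-in, with a 1-based bit position like mIRC's):

```python
def isbit(n: int, pos: int) -> int:
    # pos is 1-based; Python's >> sign-extends negatives infinitely,
    # matching the two's-complement view described above.
    return (n >> (pos - 1)) & 1

n = -80   # ...11110110000 in two's complement
print([isbit(n, p) for p in range(1, 12)])  # [0, 0, 0, 0, 1, 1, 0, 1, 1, 1, 1]
```

Note how every position above the 8th returns 1 for -80, while for +80 they all return 0.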

#4b: $not()

Yes, this should require a 'bits' parm, because this one 'flips all the bits', and it needs the 'bits' parm to know what 'all' means.

Quote
#4c1: num1 & num2 both positive $and(num1,num2) $or(num1,num2) $xor(num1,num2)

These 3 operators are perfectly safe to use when both num1 and num2 are positive. Any usage of the bits parm against positive inputs would be ambiguous.

Quote
#4c2: exactly 1 of num1/num2 is negative $and(num1,num2) $or(num1,num2) $xor(num1,num2)

I believe that $and and $or can both be used when only 1 of the 2 numbers is negative, by using the bit length of the positive number to determine which 2^n should be added to the negative when casting it as positive. From there, it could continue pretending that both parameters had been positive.

However, $xor(-1,positive) would be ambiguous without the 'bits' parm.
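A Python sketch of that casting rule for $and/$or (function names are hypothetical; it assumes the negative operand fits within the positive operand's bit length):

```python
def band(a: int, b: int) -> int:
    # Exactly one of a/b is negative: cast it to two's complement using
    # the positive operand's bit length, then proceed as if both were positive.
    pos, neg = (a, b) if a >= 0 else (b, a)
    width = pos.bit_length()
    return (neg + (1 << width)) & pos

def bor(a: int, b: int) -> int:
    # Same cast for OR; the result stays confined to 'width' bits.
    pos, neg = (a, b) if a >= 0 else (b, a)
    width = pos.bit_length()
    return (neg + (1 << width)) | pos

print(band(-2, 10))  # 10, matching Python's native -2 & 10
print(bor(-2, 10))   # 14, the width-limited view of -2 | 10
```

For AND the high bits of the positive operand are zero anyway, so the cast gives the same answer as true infinite sign extension; for OR it gives the width-limited positive result.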

Quote
#4c3 both of num1/num2 are negative $and(neg1,neg2) $or(neg1,neg2) $xor(neg1,neg2)

$and(negative,negative) and $or(negative,negative) cannot return a result without the 'bits' parm, because there's no positive number available to determine how to cast the negatives.

However, $xor(negative,negative) can return a credible result without the bits parm, which would only serve as a bitmask to shrink the result. The infinite 1-bits to the left cancel each other out, and the biggest_negative can be used for casting both numbers to positive.
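Python's native integers demonstrate the cancellation directly, and also show that casting both negatives through the same 2^n gives the same answer:

```python
a, b = -5, -3        # ...11111011 and ...11111101 in two's complement
print(a ^ b)         # 6: the infinite 1-bits to the left cancel out

# Casting both through the same power of two gives the same result,
# since the added high bits cancel in the XOR:
width = 8
print((a + (1 << width)) ^ (b + (1 << width)))  # also 6
```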

- -
Quote
#5 $calc ^ operator fails

My tests so far mainly involve big integers.

I was hoping for something that was exclusively big integer at arbitrarily large size, where fractions are ignored or invalid, but I know the floats will make a lot of people happy too.

And double-yay, because so far I'm finding the + - * / % operators do seem to produce accurate results against those arbitrarily huge integers, but as you mentioned about slowness, the calculation times become exponentially slower as the size of the numbers increases. I suspect that much of this is due to supporting fractions in case these integers happen to generate one.

However, while + - * / % seemed to retain accuracy when used against numbers several thousand bits in size, the ^ operator loses accuracy when the result is in the neighborhood of 2^103.

//var -s %a $calc(2^52-1) , %b1.bf $calc(%a * %a) , %b2.bf $calc(%a ^2)

* Set %a to 4503599627370495
* Set %b1.bf to 20282409603651661416747996545025
* Set %b2.bf to 20282409603651661416747996545020

The result ending with 025 is the correct answer.

//var -s %a.bf $calc(2^103)

result: 10141204801825835211973625643010

The correct last 3 digits should be 008 instead of 010, and as the 2^n gets larger, the number of trailing zeroes increases:

//var -s %a.bf 2 ^ 120

result:
1329227995784915872903807060280000000
should be:
1329227995784915872903807060280344576

- -
Quote
#6 $base() rounding/dropped digits

This test of $base found two kinds of failures for hex numbers having at least 26 digits: dropped digits and lost precision.

The //command below is set at 32 hex digits for 128 bits, and the test is to see whether (hex -> integer -> hex) arrives back at the original hex number, where it can fail in either base conversion. When %hexdigits is 30 or greater, I have yet to find this test not fail, but these are just random numbers within a huge range. I limit the output to avoid spamming myself.

//var %i 10000 , %hexdigits 32 , %count 0 | while (%i) { var %string $regsubex($str(x,%hexdigits),/x/g,$base($r(0,15),10,16)) , %b $base(%string,16,10) , %c $base(%b,10,16,%hexdigits) | if (%string !== %c) { if (%count !> 9) echo 4 -a %string !== %c : b: %b | inc %count } | dec %i } | echo -a fail: %count

However, when %hexdigits is reduced to around 27-29, the fail rate drops slightly, ranging from 93% through 99%. From a glance, the failures appeared to be caused by loss of precision among the least significant digits.

With %hexdigits set to 25 or lower, I've yet to find a random hex number giving an error, but when %hexdigits = 26 there are rare failures, somewhere around 8 per 10,000, where a digit is completely dropped. One example:

//var -s %a CB28B03A8106893332FB39ABEC , %b $base(%a,16,10) , %c $base(%b,10,16)
* Set %a to CB28B03A8106893332FB39ABEC
* Set %b to 16095909438010123464851209169900
* Set %c to CB28B03A8106893332FB39ABF

The %b result is correct, but %c drops a digit in the process of rounding. The commonality seems to be when the hex number is translated from a base10 integer whose last 2 digits are '00'. But it's not all such numbers, or else the error would have shown up 1% of the time. Now that I knew what to look for, I went back to look at longer hex strings, and found both the rounding error across multiple digits and the dropped digit. This next example also compares against the $base2() demo previously posted on the forum:

//var -s %a 7BA9BB2A5B63F8520160B5C82458 , %b1 $base(%a,16,10) , %b2 $base2(%a,16,10) , %c $base(%b1,10,16)
* Set %a to 7BA9BB2A5B63F8520160B5C82458
* Set %b1 to 2508183865617366621511317042243000
* Set %b2 to 2508183865617366621511317042242648
* Set %c to 7BA9BB2A5B63F8520160B5C825E

While the prior example had the correct answer when translating from hex to base10, this example has a larger hex number which reaches into the range where the %b1 translation to base10 has a rounding error, while %b2 using $base2() is correct, though slower. The output from the next command shows that the error in the lower bits is something other than adding extra '0' digits.

//var %hexlen 25 | while (%hexlen isnum 25-40) { var %a.bf $left($sha1(abc),%hexlen) | var -s %hexlen %hexlen, %b1.bf $base(%a.bf,16,10) , %b2.bf $base2(%a.bf,16,10) | inc %hexlen }

If $base() losing precision near 26+ hex digits is due to using the same underlying algorithm as the ^ operator, that could explain the loss of precision in these results, but it may not explain the dropped digit. I tried briefly to find a result where 2 digits are dropped, but couldn't. I also didn't try to see if a digit can get dropped when translating hex->base10.

I think I can come up with something that is as fast as $base2() without using pow(). I'm not sure how my algorithm would translate to executable code, because it would chop the number into separate pocket chunks instead of keeping it as a single string. But it wouldn't need to calculate 2^400 while translating a 512-bit number.

I also have something that would repair the ^ operator, but it only works for non-negative integer exponents.
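One standard repair along those lines is binary exponentiation in exact integer arithmetic, sketched here in Python (my illustration of the approach, not mIRC's internals; ipow is a made-up name):

```python
def ipow(base: int, exp: int) -> int:
    # Exact integer power via square-and-multiply; no floating-point
    # pow(), so no lost low digits. Non-negative integer exponents only.
    if exp < 0:
        raise ValueError("non-negative integer exponents only")
    result = 1
    while exp:
        if exp & 1:
            result *= base
        base *= base
        exp >>= 1
    return result

print(ipow(2, 103))  # 10141204801825835211973625643008
print(ipow(2, 120))  # 1329227995784915872903807060280344576
```

Both outputs end in the correct digits that the ^ operator currently loses.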

--

Some combinations of in/out bases can have alternate calculations that would be much faster, and even faster without the need to support the entire base36 alphabet when translating from hex to base10, etc. For example, translating from base16 by left-padding an odd-length string with a '0', then storing the hex pairs as bytes inside the internal storage array. Each of those bytes is then a 'pocket' where the 'carry' value isn't larger than 255. While the $base2() demo translated everything to base10 before translating to outbase, for executable code I imagine the efficient route would be translating inbase to the hex format of the internal storage, and from there translating to outbase.
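A minimal Python sketch of the pocket idea for the hex-to-base10 direction: hex pairs become base-256 bytes, and decimal digits are peeled off by repeated division, so no pow() is ever needed (hex_to_base10 is an illustrative name, not a proposed identifier):

```python
def hex_to_base10(hexstr: str) -> str:
    # Left-pad odd-length input, store hex pairs as bytes ("pockets").
    if len(hexstr) % 2:
        hexstr = "0" + hexstr
    pockets = [int(hexstr[i:i + 2], 16) for i in range(0, len(hexstr), 2)]
    digits = []
    while any(pockets):
        # Divide the whole pocket array by 10; the remainder is one
        # decimal digit. The running remainder never exceeds 9.
        rem = 0
        for i, p in enumerate(pockets):
            cur = rem * 256 + p
            pockets[i] = cur // 10
            rem = cur % 10
        digits.append(str(rem))
    return "".join(reversed(digits)) or "0"

print(hex_to_base10("CB28B03A8106893332FB39ABEC"))  # the correct %b value above
```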

Quote
#7 $bytes
$bytes is another candidate for the larger BF range:

//var -s %foo.bf 2 ^ 72 | echo -a $bytes(%foo.bf,b)

Quote
#8 $rand $rands
I'm curious how the $rand result is obtained for a range size larger than 2^64?

As I understand it, the $rands SystemFunction036 gives you the number of random bytes requested, and the large range could be obtained by just requesting more bytes. But JSF64 returns just a single uint64, so I wasn't sure if the BF results are obtained by appending several JSF outputs together until the underlying pseudo-random result is >= the requested range size.

//var -s %256bitrand.bf $rand(0,115792089237316195423570985008687907853269984665640564039457584007913129639935) , %randhex $base(%256bitrand.bf,10,16,64)

Or, is BF mode trying to use the Knuth LCG that's baked into the bigfloat package? If I understand how this LCG works, it uses a 49-bit prime 'm', and returns a number in the range [0,m-1] which matches the new internal state. It adjusts the returned number to the requested range by dividing by 'm' to get a fraction between 0-1, which can then be multiplied by the requested range size.

JSF64 has a 64*4=256-bit internal state, and regardless of the value of the prior 64-bit output, it's possible for every 64-bit value to be the next output, including repeating the prior value, and it's not trivial to determine what the next output is.

On the other hand, the LCG's returned output is 100% of the internal RNG state, and each of the 49-bit RNG outputs can't be returned again until an interval of 2^49 outputs has passed, where each of the other 49-bit numbers has its turn at being returned by the LCG. If 'm' is known, someone can request a range size matching 'm', which will return the value matching the entire internal state, allowing them to predict all the following 'random' number outputs, except when something else interrupts to ask for its own random number.
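To make that predictability concrete, here's a Python sketch with made-up constants (the actual 'a' and 'm' in the bigfloat package are surely different; the point is only the mechanics):

```python
M = (1 << 49) - 81    # stand-in for the 49-bit modulus 'm' (not the real one)
A = 1234567891        # stand-in multiplier

def lcg_next(state: int) -> int:
    # Multiplicative LCG: the output IS the new internal state.
    return (A * state) % M

state = 123456789012345 % M
leaked = lcg_next(state)       # observed output when the range size == m
predicted = lcg_next(leaked)   # attacker's prediction of the next output
print(predicted == lcg_next(lcg_next(state)))  # True
```

Once one full-width output leaks, every later output follows deterministically; nothing comparable is possible from a single JSF64 output, since its 256-bit state is never exposed whole.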

--

Whatever is happening with $rand in BF mode, it's causing the time for $rand(1,9) to increase 7x, which is something I'm not seeing for a similarly small range like $calc(2+2) being in/out of BF mode.

//var %z 99999 , %i %z , %t $ticksqpc | while (%i) { !var %a.bx $rand(1,9) | !dec %i } | var %t1 $calc($ticksqpc - %t) , %i %z, %t $ticksqpc | while (%i) { !var %a.bf $rand(1,9) | !dec %i } | var %t2 $calc($ticksqpc - %t) | echo -a bf mode %t2 / doubles mode %t1 = $calc(%t2 / %t1)

Quote
#9 $pi Returns the value of the mathematical constant pi to 20 decimal places.

Nobody laugh, but in doubles mode the increased precision from $pi can affect the result, as in $calc(100000000*$pi). Since $pi in doubles mode had enough digits to make it less likely to be the source of a rounding error, it seems reasonable that, in BF mode where fractions are returned with 30 digits, the 20-digit fraction of $pi should be lengthened to have more digits than that.

In doubles mode, the fraction for $pi was 20/6 = 333% of the length returned by $calc, so since $calc can now return 30-digit fractions, $pi keeping pace with that would have 100 digits.

And no, I don't think the fraction needs to be that long, but it should probably be at least slightly longer than the 30 digits $calc can return.