
/bset enhancements #270136 08/04/22 12:46 PM
Joined: Jan 2004
Posts: 1,745
maroon Offline OP
Hoopy frood
Quote

1. /bset should return an error if a non-numeric byte value is used without using the -t switch


//bset -c &var 1 foobar | echo -a $bvar(&var,1-)

This is obviously a script error, but it currently fakes the user out by setting the &binvar to be $asc(foobar) instead of halting with an error. If the string is non-numeric text, nobody wants the string to be set as if the -t switch were used, but only for just the 1st character. Otherwise they would have used

//bset -c &var 1 $asc(foobar) | echo -a $bvar(&var,1-)
or
//bset -c &var 1 $left(foobar,1) | echo -a $bvar(&var,1-)

Quote

2. Give /bset some kind of -z switch which allows creating a &binvar from blank input


There are cases where a zero length &binvar is needed, and this would avoid needing to do this with slow and/or complex commands. The workarounds are currently like:

//noop $regsubex(name,$null,,,&var) | echo -a $bvar(&var) $bvar(&var,0)
or
//bset -t &var 1 A | noop $decode(&var,bm) | echo -a $bvar(&var) $bvar(&var,0)

One use case for a blank &binvar is to give it the same scope as "/set -u0 %global $null" where it can be seen by a subroutine alias and it gets cleaned up when the script finishes.

//bset -c &var 1 %string | echo -a result: $bvar(&var,1-)

Also, if %string is null or contains only spaces, the above halts with an error. When the script would rather have the binvar created as zero length than halt, the next example would create &var as empty if %string is blank or contains only spaces that get stripped out of the command parm seen by /bset:

//bset &var 1 | echo -a result: $bvar(&var) $qt($bvar(&var,1-)) should be &var ""

Quote

3. Allow /bset to support 32-bit input instead of 31-bit input.


Currently, the next command sets &var to be the lowest 8 bits of %number, but only if it's in the range [0,2^31-1]. That saves scripts from the need to do something like $calc(%number % 256)

//bset -c &var 1 %number | echo -a $bvar(&var,1-)

Since /bset supports this behavior, it would be more useful if it supports the whole [0,2^32-1] range. If not supporting 32bit unsigned ints, then it should do the next best thing, by casting negatives within the signed-int32 range accordingly. Currently, it always returns 255 for %number greater than 2^31-1, and currently reports negative numbers as 45 by seeing the codepoint for the minus sign. If it's not going to handle input as a signed int32, then a negative number without using -t should be an error rather than always being 45 due to the #1 change of not handling input as if text when -t isn't used.

It's great that it currently ignores fractions to avoid the need for $int(%number) or $calc(%number //1)
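
To make requests #1 and #3 concrete, here is a small Python model of how a numeric token could map to the stored byte. It is only a sketch of the requested behavior, not a description of how mIRC works internally; the name `to_byte` and the exact error handling are assumptions made for illustration.

```python
import re

# A token counts as numeric if it is an optionally negative integer,
# optionally followed by a fraction (which is ignored).
_NUMERIC = re.compile(r"-?\d+(\.\d*)?")

def to_byte(token: str) -> int:
    """Hypothetical model of the requested /bset conversion without -t."""
    if not _NUMERIC.fullmatch(token):
        # Request #1: non-numeric input without -t is an error, not $asc().
        raise ValueError(f"non-numeric byte value: {token!r}")
    number = int(token.split(".")[0])  # fraction dropped, as /bset already does
    if not -2**31 <= number <= 2**32 - 1:
        # Assumption: anything outside int32/uint32 is rejected.
        raise ValueError(f"value out of 32-bit range: {token!r}")
    # Request #3: keep the lowest 8 bits; negatives are cast two's-complement style.
    return number & 0xFF
```

Under this model `to_byte("300")` gives 44, `to_byte("4294967295")` and `to_byte("-1")` both give 255, and `to_byte("foobar")` raises an error instead of silently storing 102.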

Quote

4. Better support for the 2-byte and 4-byte properties.


If someone is using any of .word .nword .long .nlong, $bvar should not be returning byte values. But that's what happens with any of the following 3 formats, where it returns byte output as if the .prop were not used:

//bset -c &var 1 1 2 3 4 5 6 7 8 9 10 | echo -a $bvar(&var,1-2).nlong
//bset -c &var 1 1 2 3 4 5 6 7 8 9 10 | echo -a $bvar(&var,1-).nlong
//bset -c &var 1 1 2 3 4 5 6 7 8 9 10 | echo -a $bvar(&var,1,2).nlong

Nobody using any of these 4 props wants a byte value, so this should either be a halting script error or, preferably, return multiple multibyte values. If multiple outputs are supported, these 4 props should behave like the same request made without the .prop, halting with an error only in the situations where $bvar would also return an error when harvesting byte values without the .prop

Below currently returns 123 as if the 1st 4 bytes are '0 0 0 123' instead of returning $null or error due to there not being 4 bytes found at that &binvar position:

//bset -c &var 1 123 | echo -a result: $bvar(&var,1-).nlong should be null because there aren't 4 bytes
//bset -c &var 1 123 0 0 0 1 2 3 4 | echo -a result: $bvar(&var,1-).nlong should be 2063597568 16909060
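
The two expected results above can be checked with a short Python model of a multi-value .nlong read. This is a sketch of the requested behavior only; the function name `read_nlongs` is hypothetical, and ignoring an incomplete trailing group is an assumption.

```python
from typing import List

def read_nlongs(data: bytes, pos: int = 1) -> List[int]:
    """Model of $bvar(&var,N-).nlong returning every complete 4-byte value.

    pos is 1-based like a &binvar position. Bytes are read in network
    (big-endian) order. Fewer than 4 available bytes yields an empty list.
    """
    start = pos - 1
    complete = max((len(data) - start) // 4, 0)  # whole 4-byte groups only
    return [
        int.from_bytes(data[start + 4 * i : start + 4 * i + 4], "big")
        for i in range(complete)
    ]
```

With the single byte 123 this returns an empty list, and with the 8 bytes `123 0 0 0 1 2 3 4` it returns 2063597568 and 16909060, matching the values asked for in the two commands.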

Quote

5. Would be great if better support for $bvar's 2/4 byte values enabled the ability to be able to easily use /bset to create them.


One reason for storing uint32's in a &binvar is that it's significantly faster to store an array in a &binvar than to use a text string where you handle them with $gettok, $addtok $puttok etc.

An example of using a &binvar as a byte[] array is the base85 decoder in
https://forums.mirc.com/ubbthreads....lowfish-improvements-wishlist#Post265960
where it was significantly faster to store the lookup table in a &binvar than to use $mid $pos $gettok etc. It's much faster to access the 19th item of a byte array as $bvar(&binvar,19) than $gettok(%array,19,32) , and faster to modify an array element using /bset &binvar 19 %number than /var %array $puttok(%array,%number,19,32)

However, that currently works only if the values are going to be in the 0-255 range, so if there are negative numbers or numbers larger than 255, it won't work right.

By allowing /bset to have a simple way to store 2-byte and 4-byte values, that would allow scripts to be faster by storing their numbers in a &binvar [] array as if a series of words or longs, which could then be /hadd -b'ed into a hashtable if needed.

These next examples show that there is a non-obvious way to store .nlong and .nword values into the &binvar by cheating and using $longip, but it adds complexity and slowness, and doesn't work with little-endian .long

//var -s %counter 123456789 | inc -s %counter | bset &v 1 $replace($longip(%counter),.,$chr(32)) | echo -a result: $bvar(&v,1).nlong

result: 123456790

//var -s %counter 12345 | inc -s %counter | bset &v 1 $calc(%counter // 256) $calc(%counter % 256) | echo -a result: $bvar(&v,1).nword

result: 12346

To use this text method for little-endian .word, swap the 2 calc's around

//var -s %counter 12345 | inc -s %counter | bset &v 1 $calc(%counter % 256) $calc(%counter // 256) | echo -a result: $bvar(&v,1).word

result: 12346

But trying to store a little-endian .long would need an alias that would use $longip then use /tokenize so that bset could see them in $4 $3 $2 $1 order, or would need 4 separate $calc's to create the 4 individual values:

//var -s %counter 123456789 | inc -s %counter | bset &v 1 $calc(%counter % 256) $calc((%counter // 2^8) % 256) $calc((%counter // 2^16) % 256) $calc(%counter // 2^24) | echo -a result: $bvar(&v,1).long : $bvar(&v,1-)

result: 123456790 : 22 205 91 7
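
For readers who want to verify the byte values, the same arithmetic as the $calc workarounds can be written in Python. This is just a cross-check of the numbers above; the helper names are made up for the illustration.

```python
from typing import List

def split_with_calc(value: int) -> List[int]:
    """Little-endian bytes of a uint32, using the same math as the 4 $calc's."""
    return [
        value % 256,
        (value // 2**8) % 256,
        (value // 2**16) % 256,
        value // 2**24,
    ]

def split_bytes(value: int, width: int, network: bool) -> List[int]:
    """Bytes of an unsigned value in network (big-endian) or little-endian order."""
    return list(value.to_bytes(width, "big" if network else "little"))

# Both approaches agree on the .long example: 22 205 91 7
assert split_with_calc(123456790) == split_bytes(123456790, 4, network=False)
```

Reversing the list gives the network-order bytes 7 91 205 22, which is the same sequence that $longip produces as a dotted string.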

Solution: just like /bset defaults to creating/updating a &binvar as byte values but allows -t to override so the input is seen as text, add new switches which allow overriding the input to be seen as multibyte values.

* The -w and -l switches would see the input as words or longs needing to be stored, instead of as bytes. Using more than 1 of -w -l -t together would be a syntax error

* The -n switch would be valid only in the presence of the -w or -l switch, and would store the values in network aka bigendian order instead of little-endian

* The -p switch would cause the position value to be treated as the location within the &binvar where the Nth such multi-byte item would be located. This would simplify interacting with these items within an array, instead of needing to calculate them all the time.

Storing the .nlong as bytes 10-13 within the &binvar:

/bset -nl &binvar 10 123456789

... which is retrieved by $bvar(&binvar,10).nlong

/bset -nwc &binvar 1 2 3
... creates the &binvar containing the bytes "0 2 0 3"

The -p switch in the next example would cause the position to be wherever the 10th multibyte item would be located within a &binvar. For -l or -nl the 10th item would be the 4 bytes located at the 10*4-3'th byte, and for -w or -nw the 10th item would be the 2 bytes located at the 10*2-1'th byte:

/bset -nlp &var 10 123456789

--

While the range would normally be [0,65535] for -w and would be [0,2^32-1] for the -l's, 16-bit or 32-bit negative numbers could be cast, then stored the same way the bytes for -12345 would be stored in a 16bit or 32bit C++ array. If it's a problem to support 2 separate 32bit ranges, I suppose there can be some kind of switch to make it see the input as signed int32 instead of as uint32.
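
The intended effect of the proposed switches can be modeled in Python as below. This is a sketch of the request, not existing /bset behavior: the function `bset_model` and its parameters are hypothetical stand-ins (`width` for -w/-l, `network` for -n, `by_index` for -p), and zero-filling the gap before the write position is an assumption.

```python
def bset_model(buf, pos, values, width=1, network=False, by_index=False):
    """Write values into bytearray buf at 1-based pos and return buf.

    width:    1 = bytes (default), 2 = words (-w), 4 = longs (-l)
    network:  True stores big-endian (-n), False stores little-endian
    by_index: True treats pos as the Nth multibyte item (-p)
    """
    if width not in (1, 2, 4):
        raise ValueError("width must be 1, 2 or 4 bytes")
    if pos < 1:
        raise ValueError("positions are 1-based")
    # -p: the Nth item starts at byte N*width - (width - 1), i.e. offset (N-1)*width
    offset = (pos - 1) * width if by_index else pos - 1
    bits = 8 * width
    mask = (1 << bits) - 1
    order = "big" if network else "little"
    for value in values:
        # Accept the unsigned range plus negatives that fit the signed range.
        if not -(1 << (bits - 1)) <= value <= mask:
            raise ValueError(f"{value} does not fit in {bits} bits")
        end = offset + width
        if len(buf) < end:
            buf.extend(bytes(end - len(buf)))  # zero-fill up to the write position
        # Masking casts negatives the way a C++ fixed-width integer stores them.
        buf[offset:end] = (value & mask).to_bytes(width, order)
        offset = end
    return buf
```

For example, `bset_model(bytearray(), 10, [123456789], width=4, network=True)` puts 7 91 205 21 at bytes 10-13, while adding `by_index=True` puts the same 4 bytes at 37-40. Storing -12345 as a little-endian word gives the bytes 199 207.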

Quote

6. Allow $bvar to use the -p syntax of /bset


There are 2 ways I can think of to mirror this in $bvar, so the user need not translate back and forth between the Nth item in the uint[] &array and the physical location of its 4 bytes. One way would be to let the 2nd N/range parameter be preceded by 'p', so the latter .nlong examples would access the 10th nlong at byte positions 37-40 instead of using the 4 bytes at 10-13, like:

$bvar(&var,p10).nlong
$bvar(&var,p10-).nlong
$bvar(&var,p10-11).nlong
$bvar(&var,p10,2).nlong etc

The alternative would be a parm4 where the 'p' switch would go. In either case, 'p' should either be ignored or syntax error in the absence of -l or -w

As a digression of this point, I'm not sure why it needs to be that $bvar(&nosuchbinvar,1,1) and $bvar(&nosuchbinvar,1-) need to be treated differently than $bvar(&nosuchbinvar,1)

Quote

7. To make it easier to store &binvar arrays holding negative numbers, it would be great for $bvar to have additional .props or an 's' switch to go along with the 'p' switch, which allows the .prop values to be retrieved as signed instead of unsigned.


The example below shows that, even though .long in programming languages generally refers to a 'signed' variable, mIRC here shows them as unsigned.

//bset &v 1 255 255 255 255 | echo -a result: $bvar(&v,1).nlong and $bvar(&v,1).nword

To obtain -1 from the above, either 4 new .props like .slong .nslong .sword .nsword would be created, or have the 's' in the 4th parm.

While signed values can be created from unsigned, benchmarks show that $calc and $iif() tend to noticeably slow down scripts when they're used within an inner while() loop

//var %i 1 | while (%i isnum 1-30) { bset &v 1 $regsubex(x x x x,/x/g,$rand(0,255)) | echo 4 -a %i result: $iif($bvar(&v,1).nlong > 2147483647,$calc($v1 - 2^32),$v1) | inc %i }

... so it would be simpler if signed byte values could be written like:

/bset -s &v 1 -123 45 -67 89
/bset -nsl &v 1 -123 4567 -8901
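
The signed conversion being requested is the usual two's-complement reinterpretation. A minimal Python sketch of it follows; the helper names are hypothetical and it only models what the proposed signed props would return.

```python
def to_signed(value: int, bits: int) -> int:
    """Reinterpret an unsigned value of the given bit width as signed."""
    if value >= 1 << (bits - 1):
        return value - (1 << bits)  # same as the $calc($v1 - 2^32) workaround
    return value

def read_signed(data: bytes, network: bool = True) -> int:
    """Read all of data as one signed integer, big-endian when network is True."""
    return int.from_bytes(data, "big" if network else "little", signed=True)
```

For the bytes 255 255 255 255, the unsigned reads are 4294967295 (.nlong) and 65535 (.nword); `to_signed(4294967295, 32)` and `to_signed(65535, 16)` both give -1, which is what a signed variant would be expected to return.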

Quote
8. /bread file S -1 &binvar


Instead of needing to use $file(filename).size or using 999999999, it should be enough to use -1 as the N length parm to just read all bytes available at that file offset:

//bread $qt($mircini) 0 -1 &v | echo -a $bvar(&v,0)

This complements the fact that /bwrite recognizes -1 as the N for dumping the whole &binvar to disk:

//bset -t &v 1 abc | bwrite -c delme.txt 0 -1 &v | echo -a result: $file(delme.txt).size
result: 3

Quote
9. /bread and /bwrite switch -p for position compatibility with /bset /bcopy $bfind


The F1 /help uses the term 'byte position' for /bread and /bwrite, but a more accurate term is 'file offset'. It's easy for users to be confused by the F1 description, due to the fact that strings begin at position 1 while in a &binvar, but they shift into beginning at offset 0 while on disk.

In addition to this, it would be great for /bread and /bwrite to have some kind of -p switch that makes them treat the file offset using the same 1-based position numbering system as used for &binvar strings. This would avoid the need for using /inc and /dec or $calc to translate back and forth between the 1-based lingo used by /bset and /bcopy vs the 0-based lingo used by /bwrite and /bread.

i.e. if reading something from disk into a &binvar, $bfind gives a result that's off-by-1 compared to where /bread thinks it's located on disk, because $bfind thinks the 1st byte is at 1 but bread/bwrite think the disk location is at zero.

In effect, this -p would simply make /bread and /bwrite see the S parm as a number to be decremented by 1 then used as the file offset, so that these would be equivalents

//bset -czt &v %pos string | var %disk_position %pos - 1 | bwrite -c filename %disk_position -1 &v
//bset -czt &v %pos string | bwrite -cp filename %pos -1 &v

(length = -1 would write the whole &binvar)

Quote

read 1st byte from disk: bread file 0 1 &var
read 1st byte from disk: bread -p file 1 1 &var


Using file offset as 0 should either be an error when used with -p, or should not be decremented, because it should not be desirable that S=0 be changed into the -1 that makes the &binvar be appended to the disk file.

This is based on mIRC thinking that $mid(abcde,1) and $mid(abcde,0) are the same.

This next should either be an error, or should read the 1st byte by treating S=0 the same as S=1:
/bread -p file 0 1 &var

Since all S=negatives are currently treated by /bwrite as if the same as S=-1 append, I guess it would be fine to ignore -p as having no effect when the S parm is already negative.

These both would read starting at the %Nth byte of the file:

Quote

/bread file.dat $calc(%Nth -1) %length &var
/bread -p file.dat %Nth %length &var
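
The translation that -p would perform is small enough to state as code. The Python sketch below models one of the two options discussed above, where S=0 is treated like S=1 instead of being an error; the function name `file_offset` is hypothetical.

```python
def file_offset(s: int, one_based: bool) -> int:
    """Return the 0-based file offset for the S parm, or -1 for append.

    one_based models the proposed -p switch.
    """
    if s < 0:
        return -1            # all negatives already mean append; -p has no effect
    if not one_based:
        return s             # current behavior: S is already a 0-based offset
    return max(s - 1, 0)     # -p: position 1 is offset 0; S=0 treated as S=1
```

So `file_offset(1, True)` and `file_offset(0, True)` both give 0, `file_offset(5, True)` gives 4, and a negative S is never decremented into something else.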

Re: /bset enhancements [Re: maroon] #270138 08/04/22 10:39 PM
Joined: Jul 2006
Posts: 3,880
Wims Offline
Hoopy frood
1) it would be better, but given that it breaks backward compatibility for something that technically should not be used, it's probably not worth fixing.

2) You didn't really give a strong example of why empty binvars are needed. Here is one that happened to me while helping westor communicate using the RCON protocol over a socket:

With a binary protocol, where a newline is not what defines a packet, you must buffer the data until you have enough to form a 'packet'. In this case I simplified it so that whenever the buffer holds 4 bytes or more, we delete those 4 bytes from the buffer. /bcopy does not accept a position larger than the length of the binvar, which is normal, but when exactly 4 bytes remain in the binvar, bcopy fails, which is not handy here. You must then work around it the way maroon suggested, creating an empty binvar so that /hadd won't fail. One may argue this example is easily solved by setting a variable instead and checking it before /hadd'ing &buffer, or else /hdel'ing that item, but it still illustrates the need.
So I agree and like your /bset suggestion to create an empty binvar, but I think the -c switch should be extended to create an empty binvar, rather than give an error, when no byte parameter is passed.

2bis) I would also like a new switch for /bcopy (or maybe an extension of -c as well) that allows a position larger than the length of the binvar: if a larger position is passed, the binvar is chopped at position N, making the entire else branch in this example unnecessary.
Code
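ON *:SOCKREAD:test: {
  sockread $sock($sockname).rq &read
  ; load the saved buffer and append the newly read bytes
  noop $hget(buffer,buffer,&buffer)
  bcopy &buffer -1 &read 1 -1
  while ($bvar(&buffer,0)) {
    ; wait for more data until a full 4-byte packet is available
    if ($bvar(&buffer,0) < 4) return
    ; more than 4 bytes: drop the first 4 and keep the rest
    if ($bvar(&buffer,0) > 4) bcopy -c &buffer 1 &buffer 5 -1
    else {
      ; exactly 4 bytes: position 5 is past the end, so recreate &buffer as empty
      bunset &buffer
      noop $regsubex(name,,,,&buffer)
    }
    hadd -b buffer buffer &buffer
  }
}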
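
For comparison, here is the same buffering pattern sketched in Python, where an empty buffer is an ordinary value and no special case is needed when exactly one packet remains. It is only an illustration of the pattern; the function name `feed` and the fixed 4-byte packet size are assumptions carried over from the simplified example.

```python
from typing import List

def feed(buffer: bytearray, incoming: bytes, packet_size: int = 4) -> List[bytes]:
    """Append incoming bytes to buffer and pop every complete packet.

    The buffer is modified in place and keeps any leftover partial packet,
    which may legitimately be zero bytes long.
    """
    buffer.extend(incoming)
    packets = []
    while len(buffer) >= packet_size:
        packets.append(bytes(buffer[:packet_size]))
        del buffer[:packet_size]  # works even when this empties the buffer
    return packets
```

Feeding 6 bytes yields one packet and leaves 2 bytes buffered; feeding 2 more yields a second packet and leaves a zero-length buffer, which is the state that currently needs a workaround to represent as a &binvar.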


3-5-7) Having control about signed/unsigned, 16bits/32bits numbers for the input and the output would be superb.

4) same answer as 1)

6) yes but definitely the 4th parameter syntax, which preserves backward compat

8) Yes!

9) yes but I think the switch should be -f for file offset, rather than -p, which could be confused with -p of /bset.

Re: /bset enhancements [Re: Wims] #270146 10/04/22 09:06 AM
Joined: Jan 2004
Posts: 1,745
maroon Offline OP
Hoopy frood
Responses to your responses.

Your response for #1:

I don't see how it breaks compatibility for /bset to report an error when someone makes a syntax error, instead of giving them something that nobody would want. The default for /bset is to treat input as numeric byte values, so it should not switch to seeing it as text when the -t switch is not used. Someone feeding %string to /bset wanted the whole thing put into the &binvar. They wouldn't want it treated as a byte value when it happens to be an integer, or to have 1 character of the string put into the &binvar when it's not. It would be as if $encode(&soitgoes) decided to use the contents of a binvar if a binvar of that name happened to exist at the time, but at other times treated it as a text string.

Your response for #4:

You could argue whether or not you think it's a useful feature, but I don't understand your claim that this interferes with backwards compatibility. Nobody is intentionally using $bvar(&v,1-).nlong and using the .property with the intention of being given 1-or-more byte values instead.

My proposed change would make the identifier's behavior conform with F1 help, where the 4 .properties don't say 'sure but only if you ask it for just 1 of them'. Note it says values plural not 1 value:

Quote
The word, nword, long, and nlong properties return values in host or network byte order.


Your response for #9:

Agreed. It probably makes more sense to have a different letter. I was just trying to make the switch letter be intuitive, and p-for-position is what I came up with, but this is fine.

Your response for #2:

First time you yell at me that my post wasn't long enough. Progress! :)

To keep it shorter I just listed 1 example in the post directly. But I also had a 2nd example in the base85 decoder script I linked, where I did this so that it has the same behavior as $decode does for mime, where it sets the &binvar as blank if it contains exactly 1 encoding character. i.e.

//bset -t &v 1 B | noop $decode(&v,bm) | echo -a blank binvar: $bvar(&v) : $bvar(&v,0)

A 3rd way is in some of my other scripts to immunize a &binvar from identifiers that bomb if you feed them a binvar that doesn't exist:

echo -a debug message: $bvar(&no_such_binvar,1-)
echo -a debug message: $sha1(&no_such_binvar,1)
bcopy -c &destination 1 &no_such_binvar 1 -1
bwrite -c delme.txt 0 -1 &no_such_binvar

Regarding #2, you mentioned in channel that you thought a switch wouldn't be needed, and that it could just create a blank binvar from "bset &var 1 $null" or a &binvar containing 3 zeroes from "bset &var 4 $null". I based the -z switch on the new switch for /writeini, which requires a switch to verify that the intent is to write a blank rather than doing it accidentally because a %variable was empty. The opposite is done by /write, though, as in your suggestion: "write delme.txt" does create a zero byte file when the write_string is 'missing'.