regex with binvar #260478 27/04/17 03:28 PM
Joined: Jul 2006
Posts: 3,715
Wims (OP)
Hoopy frood
While this may have been suggested in the past, I don't think it was pointed out why this is needed.
One simple example that pushed me to write this: mIRC's built-in log file viewer does not let me search with a regex. I could suggest that here as well, but in the meantime I still need the functionality.
My log files are much bigger than 4150 characters, of course (so $regex is not an option).
You may be wondering why I am not using /filter, for example. That leads to the real problem: how do I correctly search across multiple lines at once? I simply cannot.
Concrete example: I want to find someone saying 'hash table' in my log, but only when the next line contains, let's say, 'efficiency'.
In PCRE that would be something like:

/^TIMESTAMP NICK .*?hash table.*\nTIMESTAMP NICK .*?efficiency/m

Where TIMESTAMP and NICK handle the format for my timestamp and nick decoration. The /m flag lets ^ match at the start of any line; /s is left out so that . cannot cross a newline and 'efficiency' has to be on the very next line.
Of course it's possible to work around this by calling /filter -nk to find the first line and then checking that the next line has what we want (this would probably need a $read call inside the custom alias called by /filter, which is terrible!).
I also think this is long overdue; I don't see why it wasn't added before.
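
For illustration, here is a minimal sketch of that kind of workaround, written as a plain line-by-line loop rather than with /filter. The alias name findpair and the two hard-coded patterns are hypothetical, and the timestamp/nick handling is left out. It relies on $read for every line it inspects, which is the kind of overhead that makes this approach unattractive on a large log.

```mirc
alias findpair {
  ; Hypothetical helper: /findpair <path to log file>
  ; Echoes every line number N where line N contains "hash table"
  ; and line N+1 contains "efficiency".
  var %file = $1-
  var %i = 1, %n = $lines(%file)
  while (%i < %n) {
    ; the n switch stops $read from evaluating the line's contents
    if ($regex($read(%file, n, %i), /hash table/i)) {
      if ($regex($read(%file, n, $calc(%i + 1)), /efficiency/i)) {
        echo -a Match at line %i
      }
    }
    inc %i
  }
}
```

A regex-aware search over a &binvar holding the whole file would replace this loop with a single call.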

I was thinking about improving $bfind in this regard. $bfind(&bvar,pattern) seems fine: if the pattern consists only of a number, users can wrap it in delimiters, so this shouldn't break existing scripts.

However, this is only good for searching; I suppose replacing would also be great.

$breplace(&binvar,a,o,more,replacement) - no regex replacement, would be equivalent to /breplace but works with strings
$breplace(&binvar,/(a)(.)/g,o \n \1).regex - regex replacement, behaves the same as $regsubex with its new behavior regarding $regsub: it returns the number of replacements made, since returning the &binvar wouldn't be useful.

$breplacex would also be a thing, then.

These are just quick ideas for the syntax; it could be different as long as the functionality is there.


Looking for a good help channel about mIRC? Check #mircscripting @ irc.swiftirc.net
Re: regex with binvar [Re: Wims] #261729 20/11/17 12:05 AM
Joined: Aug 2016
Posts: 49
rockcavera
Ameglian cow
It would be very interesting to have an identifier that would work with &binvar as input and regex to work with that input, not only for replacement, but also for search, since $bfind does not work with &binvar as input.


rockcavera
#Scripts @ irc.VirtuaLife.com.br
Re: regex with binvar [Re: rockcavera] #261730 20/11/17 11:07 AM
Joined: Jul 2006
Posts: 3,715
Wims (OP)
Hoopy frood
Thanks for the support!
Quote:
since $bfind does not work with &binvar as input
$bfind does support binvar as input

Last edited by Wims; 20/11/17 02:50 PM.

Re: regex with binvar [Re: Wims] #261731 20/11/17 08:43 PM
Joined: Aug 2016
Posts: 49
rockcavera
Ameglian cow
Thanks for correcting me, Wims.

Actually I wanted to mention that $bfind does not work with regex.

Disregard the final part "... since $bfind does not work with &binvar as input."


Re: regex with binvar [Re: Wims] #261929 15/12/17 03:26 AM
Joined: Jun 2008
Posts: 6
digitok
Nutrimatic drinks dispenser
I also support this idea.

Re: regex with binvar [Re: Wims] #261944 15/12/17 04:42 PM
Joined: Jul 2013
Posts: 27
kikuchi
Ameglian cow
I also support this suggestion.

Re: regex with binvar [Re: Wims] #266720 21/01/20 01:08 PM
Joined: Apr 2010
Posts: 945
FroggieDaFrog
Hoopy frood
Bumping this thread as I'm in need of this feature.


I'm currently working on an HTTP implementation. Within that implementation I need to verify that header values are formatted correctly. A header's value is stored in a bvar and can exceed the ~8k string-length limit.

I need to check whether the header's value contains only byte values in the ASCII range (1-126). Currently this requires a (slow) loop for what amounts to an OR check of values:
Code
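alias isAscii {
  ; An empty or missing bvar is not considered valid
  if (!$bvar($1, 0)) {
    return $false
  }

  var %x = 1, %len = $bvar($1, 0)
  while (%x <= %len) {
    ; Reject NUL (0) and anything above 126
    if ($bvar($1, %x) == 0 || $v1 > 126) {
      return $false
    }
    ; Increment after the check so that byte 1 is not skipped
    inc %x
  }

  return $true
}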




To amend Wims's suggestion, what I'd like to see is:
Code
$bfind(&binvar, start-position, [end-position], [name], pattern).regex
  Returns the starting position of the first found match

  &binvar
    The bvar to search

  start-position
    The starting position of which the search should begin
    Must be an integer value

  end-position - Optional
    The end position of which the search should stop
    Must be an integer value

  name - Optional
    The regex-name to use when referencing the match list via $regml()
    Must not be a numerical value

  pattern
    The regex pattern

  .regex
    Indicates a regex pattern has been specified


$breplace(&binvar, substring, newstring...)
$breplace(&binvar, substring, newstring...).cs
  Performs a text-based in-place substitution on a binary variable
  Returns the number of substitutions made

  if .cs is specified, the search will be case-sensitive


$breplace([name], &binvar, pattern, subtext).regex
  Performs a regex-based in-place substitution on a binary variable
  Returns the number of substitutions made

  You can assign a name to a $breplace().regex call which you can use later in $regml() to retrieve the list of matches.


$regml([name], n, [&binvar])
$regmlex([name], m, n, [&binvar])
    Similar to the current implementation except the result is output to the specified &binvar
    If outputting to a &binvar, the length of the bvar is returned



The reason I have chosen a new identifier over altering $replace/cs and $regsubex is that the end result differs. The current implementations build a new string and return it once the substitutions have finished. The functionality I'd like to see is substitutions performed in place.

Last edited by FroggieDaFrog; 21/01/20 02:30 PM.

I am SReject
$bvar().hex /bset|/breplace -x hex [Re: FroggieDaFrog] #267625 23/08/20 03:22 AM
Joined: Jan 2004
Posts: 1,388
maroon
Hoopy frood
Support for $bvar hex output and hex input for binary commands would be helpful when using $bfind().regex

While normally $bfind returns the position of the match, the .regex prop makes it return the number of matches instead of a position within the binvar, so your regex pattern must use a capture group and you must use $regml([name,]N).pos to find that position. However to make a regex match, in a binary string which doesn't contain UTF8 text, requires using \xNN where NN is a hex value 00-FF

Also, if /bset and /breplace supported input of replacement strings using the same hex alphabet, that would make things simpler and avoid miss-steaks.

Perhaps a .hex prop for $bvar to change the output to 2-digit hex, and a -x switch so the /bcommands would see the byte values (but not the position) as hex.

Here's some scripted aliases to sorta do what I'm suggesting.

Code
alias bset {
  if ($isid) { echo -sc info * No such identifier: $bset | halt }
  if (-* iswm $1) {
    if ((x isin $1) && (t isin $1)) { echo -sc info /bset don't use -x AND -t | halt }
    var %pattern /([0-9a-fA-F]{1,2})/g
    bset $1-3 $regsubex(foo,$4-,%pattern,$base(\t,16,10) $+ $chr(32))
  }
  else !bset $1-
}


//bset -x &v 1 61 62 63 | echo -a $bvar(&v,1-).text
result: abc

Code
alias breplace {
  if ($isid) { echo -sc info * No such identifier: $breplace | halt }
  if ($1 == -x) breplace $2 $regsubex(foo,$3-,/([0-9a-fA-F]+)/g,$base(\t,16,10))
  else !breplace $1-
}


//bset &v 1 1 2 3 255 4 5 6 | breplace -x &v ff fe | echo -a $bvar(&v,1-)
result: 1 2 3 254 4 5 6

Code
alias bvar2hex {
  if ($2 == $null) var %range 1- | else var %range $2 | if ($3) var %range $+($2,-,$calc($2 +$3 -1))
  if ($1) return $regsubex($iif(&* iswm $1,$bvar($1,%range),$1),/(\d+)/g,$base(\t,10,16,2) $iif(dot isin $prop && (!$calc(\n % 8)),. $+ $chr(32)))
  echo -sc info *bvar2hex(list of base10's) *$bvar2hex(&binvar) *$bvar2hex(&binvar,N|N-|N1-N2) *$bvar2hex(&binvar,offset,length) [.dot] formats output into groups of 8
}


//bset -x &v 1 $sha1(abc) | echo -a $bvar2hex(&v,1-)
result: A9 99 3E 36 47 06 81 6A BA 3E 25 71 78 50 C2 6C 9C D0 D8 9D

The /bset alias allows hex values to be bunched together, but if the token length is odd, the '0' is padded in front of the 1st output token created from it.

//bset -x &v 1 abcde 1 2 34 5 | echo -a $bvar2hex(&v,1-)
result: 0A BC DE 01 02 34 05

To improve readability of the hex output, I added an option to have a period following every 8th byte.

//bset &v 1 $regsubex(foo,$str(x,94),/x/g,$calc(32+ \n) $+ $chr(32)) | echo -a $bvar2hex(&v).dot

If you have the base10 byte values and just want the hex equivalents:

//echo -a $bvar2hex(00 11 22 33 44 55 66 77 88 99 111)
result: 00 0B 16 21 2C 37 42 4D 58 63 6F

It's hard for the $bvar2hex alias to estimate whether the output from range 1- fits within the line length, because the byte value obtained from $bvar has a variable 1-3 length each, while the hex output is fixed at 2 each. So, to be safe, should probably limit output to a range of $maxlenl *1/4 values.