Joined: Apr 2010
Posts: 969
F
Hoopy frood
OP Offline
There's something mIRC's scripting engine is lacking: an intuitive way to preserve and access arguments passed to an event/alias. As such I suggest the following:

$args(n)
References an immutable list of the arguments (the initial values of $1-n) passed to events and custom aliases.

if n is zero, the total number of arguments is returned
if n is greater than zero, the nth argument is returned

First, this would preserve the arguments after retokenization. For events and aliases used as commands this isn't as big of a deal -- the arguments can be stored via something akin to /var %args $1- and accessed via $gettok(%args, n, 32) -- but for custom aliases used as identifiers, where each argument may itself contain spaces, preserving the arguments means the script must loop over them and store each one separately:
Code:
//echo -a $example(a b c, 1 2 3, x y z, 4 5 6)

alias example {
  var %args0 = $0
  var %n = 1
  while (%n <= $0) {
    ; store argument n in a dynamically named variable (%args1, %args2, ...)
    var %args [ $+ [ %n ] ] = $($ $+ %n, 2)
    inc %n
  }
  tokenize 32 ...
  ;; rest of code
}
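
For comparison, the simpler command case mentioned above can be sketched as follows. This is only a minimal illustration; the alias name /cmdexample and the sample tokens are made up for the example:
Code:
alias cmdexample {
  ; copy the original arguments before /tokenize overwrites $1-
  var %args = $1-
  tokenize 44 a,b,c
  ; $1 now comes from /tokenize; %args still holds the original line
  echo -a retokenized: $1 -- original second argument: $gettok(%args, 2, 32)
}

Running /cmdexample x y z should echo "retokenized: a -- original second argument: y". This only works because command arguments are space-delimited, which is not the case for identifier arguments.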


For a known number of arguments, it may be simpler to store/look up the arguments without looping, but where the alias takes an undetermined number of arguments to be preserved, a scripted loop is required. That loop is relatively slow, which may lead to cases where mIRC stutters/freezes if such an alias is used extensively. If implemented, preservation would already be done for the user, and referencing the nth parameter should be far faster than with current methods:
Code:
; faster
echo -a $args(%n)

; slower
echo -a %args [ $+ [ %n ] ]
echo -a $eval(% $+ args $+ %n, 2)

Last edited by FroggieDaFrog; 30/11/15 12:55 AM.

I am SReject
My Stuff
Joined: Oct 2003
Posts: 3,918
A
Hoopy frood
Offline
I don't really understand the prime use case for a feature like this. As you mentioned, aliases and events don't really need this functionality since you can store values into a variable for later access. Identifiers are the prime candidate, but even then, only identifiers that use /tokenize are relevant here. I'm not really seeing that as a very common use case. Even the /tokenize command by itself is not all that common. It's been around for 16 years and I still rarely see it used in scripts. Then you have to select the subset of /tokenize cases that are used in identifiers, and even smaller still, the subset of those use cases where a variable number of arguments are accepted.

It seems as though something like $args() would only be needed for a fairly specialized use case ($identifier implementations that need to use /tokenize on var args). Any other usage can be easily scripted in a number of ways to keep track of original arguments. Specifically, the typical $identifier() implementation only takes a small number of args (1-4 is the typical range), so tracking them all as individual variables is really not as complex as you describe. You don't need to use $eval or [] for this use case, simply:

Code:
var %name = $1, %id = $2, %age = $3, %desc = $4
tokenize 32 %desc | ; tokenize the description


I personally wouldn't use (or recommend using) /tokenize in the above case, and just stick to $gettok() on %desc for readability, but I suppose I can accept the premise in which doing this is necessary.

Incidentally, storing the arguments in descriptively named variables is generally good coding practice, and something you should do even without /tokenize in the picture. $args(N) wouldn't actually help those using the above practice of storing their inputs in properly named variables for readability. I would still end up doing (and recommending) the following, just because it's more maintainable:

Code:
var %name = $args(1), %id = $args(2), %age = $args(3), %desc = $args(4)
tokenize 32 %desc | ; tokenize the description


Perhaps describing a use case where variable arguments are used in a custom identifier alias would help to explain the need for this identifier.


- argv[0] on EFnet #mIRC
- "Life is a pointer to an integer without a cast"
Joined: Dec 2008
Posts: 1,515
Hoopy frood
Offline
I think SReject's suggestion has a point and is a very good idea: if you are running a while loop over a huge number of arguments, I think $args would be faster (by 1-3 seconds, depending on the loop), so I support this idea. +1


Need Online mIRC help or an mIRC Scripting Freelancer? -> https://irc.chathub.org <-
Joined: Oct 2003
Posts: 3,918
A
Hoopy frood
Offline
Originally Posted By: westor
if you are running a while loop over a huge number of arguments, I think $args would be faster (by 1-3 seconds, depending on the loop)


This is a very hypothetical and unrealistic scenario, though, which is why I asked for a real use case. Given mIRC's line length limit, the longest $args could possibly be is only in the 100-1000 range, which is not a "huge number", though admittedly large for mIRC. That said, if you're iterating over every argument in this fashion, I would argue that the problem is much larger than just $args(). And incidentally, the /while loop itself will be a large part of the bottleneck. I guess I just don't see iterating over 100+ arguments in $1- as that common of a use case. And I'm unconvinced that $ [ $+ [ %n ] ] syntax is actually that much slower than $args() would be. According to my benchmarks, the eval syntax is actually faster than using something like $hget() (full benchmark code at bottom of post):

Code:
Command /itergettok completed 1000 iterations in 17125 ticks
Command /iterlist completed 1000 iterations in 7594 ticks
Command /itereval completed 1000 iterations in 6141 ticks
Empty loop completed 1000 iterations in 4234 ticks


It's a fairly significant difference too. In other words, you're better off using a %var than a hash table for small enough data sets. I'm sure $args() would be faster than a hash table, but based on the above numbers it seems like the bulk of the cost is coming from doing the identifier parse + dispatch (that would explain why [] eval is faster than an O(1) operation). It's hard to say concretely, but it's entirely possible that $args() may still end up being slower than $ [ $+ [ %i ] ], which might defeat the purpose if perf is the only reason for such an identifier. Also worth noting that commenting out the command in either benchmark and executing an empty loop runs in about 4000 ticks (4234 in the run above). That means that more than HALF of the /iterlist and /itereval times is taken up solely by the /while portion of the command. In other words, if you really iterated over 1000 items, you'd still be freezing up for half that time in the while loop alone.
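
To make the loop overhead concrete, the empty-loop time can be subtracted from each result. This is just a rough sketch that assumes the overhead is purely additive; the alias name /benchnet is made up, and the tick counts are copied from the output above:
Code:
alias benchnet {
  ; tick counts copied from the benchmark output above
  var %empty = 4234
  echo -a itergettok net: $calc(17125 - %empty)
  echo -a iterlist net: $calc(7594 - %empty)
  echo -a itereval net: $calc(6141 - %empty)
}

That works out to 12891, 3360 and 1907 ticks respectively, so under that assumption the lookups themselves account for well under half of the /iterlist and /itereval totals.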

The performance issue and the functionality this identifier adds should be kept separate here-- especially since there might not even be a perf issue at all. If you want to solve a performance issue around the slowness of $gettok(), that can be solved without needing $args(). Doing so would be much more useful than adding $args(), since using $args() would only help with $1-, whereas a perf improvement on $gettok() (or adding list support to the language) would help with a much larger set of cases (any kind of string processing).

If this really is about perf, the suggestion should be for generalized list support everywhere, not just arguments-- or, otherwise, better perf on $gettok() (or even just /while loops), which doesn't need any syntax changes to the language and is feasible.

Benchmark code for above test:

Code:
alias iterlist { 
  var %num = $iif($1,$1,1000)

  ; Setup table with 500 items
  var %i = 1
  while (%i <= 500) {
    hadd -m iterlist foo $+ %i a
    inc %i
  }

  ; Run benchmark grabbing each hash item
  var %x = 1
  var %time = $ticks
  while (%x <= %num) {
    %i = 1
    while (%i <= 500) {
      noop $hget(iterlist, foo $+ %i)
      inc %i
    }
    inc %x
  }
  echo -a Command /iterlist completed %num iterations in $calc($ticks - %time) ticks
  hfree iterlist
}

alias itereval { 
  ; Setup $1- with 500 tokens
  var %num = $iif($1,$1,1000)
  var %str = $str(a $chr(32), 500)
  tokenize 32 %str

  ; Run benchmark grabbing each individual token
  var %x = 1
  var %time = $ticks
  while (%x <= %num) {
    %i = 1
    while (%i <= 500) {
      noop $ [ $+ [ %i ] ] 
      inc %i
    }
    inc %x
  }
  echo -a Command /itereval completed %num iterations in $calc($ticks - %time) ticks
}

alias itergettok { 
  ; Setup $1- with 500 tokens
  var %num = $iif($1,$1,1000)
  var %str = $str(a $chr(32), 500)

  ; Run benchmark grabbing each individual token
  var %x = 1
  var %time = $ticks
  while (%x <= %num) {
    %i = 1
    while (%i <= 500) {
      noop $gettok(%str, %i, 32)
      inc %i
    }
    inc %x
  }
  echo -a Command /itergettok completed %num iterations in $calc($ticks - %time) ticks
}
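
The empty-loop baseline from the results is not part of the code above. A minimal sketch of it, assuming it is simply the same nested loop with the noop line removed (the alias name /iterempty is hypothetical):
Code:
alias iterempty {
  var %num = $iif($1,$1,1000)

  ; Same nested loop as the other aliases, with no lookup inside
  var %x = 1, %i = 1
  var %time = $ticks
  while (%x <= %num) {
    %i = 1
    while (%i <= 500) {
      inc %i
    }
    inc %x
  }
  echo -a Empty loop completed %num iterations in $calc($ticks - %time) ticks
}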


- argv[0] on EFnet #mIRC
- "Life is a pointer to an integer without a cast"
Joined: Apr 2010
Posts: 969
F
Hoopy frood
OP Offline
First, let's be clear that this request is not predominantly about speed. Though it's discussed, it isn't the leading factor for the request. The reason I've made this request is for the functionality of argument list preservation. Currently, after retokenization there is no way to access the original argument list. It feels obtuse that /tokenize is meant for general string manipulation/access yet inadvertently interacts with the only point of reference to supplied arguments. mSL is the only language I know of that has such a cross-play between two unrelated data stores; as an analogy, it's similar to if /tokenize altered an unrelated hashtable.


Functionality over Convenience
This adds functionality that mIRC currently does not have. Once /tokenize is called the argument list is destroyed unless the script has taken steps to store the needed elements from the list prior to retokenization. Depending on the case, this may be as simple as adding the arguments to the tokenized line or setting a few variables, but it may end up being a loop (which is slow in mSL) that creates a set of dynamically named variables if the number of arguments supplied is variable and they are not character-delimited.


Prime Focus and Use Case
The prime focus would be within custom identifiers that take a variable number of arguments, execution blocks that need to loop over the argument list, or execution blocks where the scripter wishes to make use of /tokenize while still preserving a reference to the arguments.

With my JSON parser, calls to $JSON() take a variable number of parameters; I build a string from those arguments to pass to the COM object via a loop similar to:
Code:
var %call = json, %n = 1
while (%n < $0) {
  inc %n
  %call = $+(%call, $chr(91), $qt($ [ $+ [ %n ] ] ), $chr(93))
  ;%call = $+(%call, $chr(91), $qt($args(%n)), $chr(93))
}
Depending on the structure of the JSON this can be, in extreme cases, 100+ parameters to loop over, and it happens with every call to $JSON(). For scripts that loop over multiple record-sets this can become quite heavy; I've seen snippets that use my JSON parser to loop over several hundred records.

In my theme I make an exception for a bot that relays messages between an IRC channel and in-game chat so messages from the game fit seamlessly into the channel buffer:
Code:
on *:TEXT:*:#:{
  var %nick = $nick, %original = $1-
  if ($isRelayedMsg) {
    %nick = $mid($1, 2-, -1)
    tokenize 32 $2-
  }

  ; ...

  ; I'd've lost the original message without having the foresight to store the message prior to retokenization.
  xlog $cid $chan %nick %original

  ; ...
}



Speed
When speaking of mIRC scripting it's hard to make an argument on speed alone, but when coupled with other supporting details it should be considered.

You say that it'd be better to use $gettok() and store the result in variables but -- aside from that approach not being valid for custom identifiers -- it's faster to use /tokenize and $n than $gettok (Bench#1, Bench#2) when needing to reference more than a single token within a string. Even if you store only the tokens you need from a string in variables it's still faster to (re)tokenize and reference $n (Bench#3).

When attempting to access the argument list by index -- such as when iterating over the inputs or having a variable number of arguments to delegate from within a custom identifier -- the speeds vary greatly depending on the approach (Bench#4), and I'd assume all of them would be much slower than accessing a native index-based data store.

Benchmark results and script at end of post


Meta
You say that using /tokenize and $n is hard to follow when compared to verbose variable stores, but when looking at where the data came from to fill those variables it can be hard to tell whether $1 is referencing a passed argument or is the result of /tokenize, more so when /tokenize is used multiple times within the same execution block. It gets even more funky when a portion of the arguments is being looped over and you see something similar to $($ $+ %i, 2).


Benchmark Results
Code:
Iterations: 1000000
-
Bench#1 Single Token Retrieval -- $gettok vs /tokenize: 25187 vs 27375
-
Bench#2 Multi Token Retrieval -- $gettok vs /tokenize: 36156 vs 31406
-
Bench#3 Var vs Tok >> 21219 vs 21000
-
Bench#4 Eval -- [] vs $() vs $eval() >> 21032 vs 34437 vs 35094


Benchmark Code
Code:
alias bench1 {
  var %a $1
  !var %ticks $~ticks
  !while (%a) {
    !noop $gettok(abc def ghi jkl mno, 2, 32)
    !dec %a
  }
  !var %ticks = $~ticks - %ticks
  return %ticks
}

alias bench2 {
  var %a $1
  !var %ticks $~ticks
  !while (%a) {
    tokenize 32 abc def ghi jkl mno
    !noop $2
    !dec %a
  }
  !var %ticks = $~ticks - %ticks
  return %ticks
}

alias bench3 {
  var %a $1
  !var %ticks $~ticks
  !while (%a) {
    !noop $gettok(abc def ghi jkl mno, 2, 32)
    !noop $gettok(abc def ghi jkl mno, 3, 32)
    !dec %a
  }
  !var %ticks = $~ticks - %ticks
  return %ticks
}

alias bench4 {
  var %a $1
  !var %ticks $~ticks
  !while (%a) {
    tokenize 32 abc def ghi jkl mno
    !noop $2
    !noop $3
    !dec %a
  }
  !var %ticks = $~ticks - %ticks
  return %ticks
}

alias bench5 {
  var %a $1
  !var %ticks $~ticks
  var %2 = $gettok(abc def ghi jkl mno, 2, 32)
  var %3 = $gettok(abc def ghi jkl mno, 3, 32)
  !while (%a) {
    !noop %2
    !noop %3
    !dec %a
  }
  !var %ticks = $~ticks - %ticks
  return %ticks
}
alias bench6 {
  var %a $1
  !var %ticks $~ticks
  tokenize 32 abc def ghi jkl mno
  !while (%a) {
    !noop $2
    !noop $3
    !dec %a
  }
  !var %ticks = $~ticks - %ticks
  return %ticks
}

alias bench7 {
  var %a $1
  var %tok 2
  tokenize 32 abc def ghi jkl mno
  !var %ticks $~ticks
  !while (%a) {
    !noop $($ $+ %tok, 2)
    !dec %a
  }
  !var %ticks = $~ticks - %ticks
  return %ticks
}

alias bench8 {
  var %a $1
  var %tok 2
  tokenize 32 abc def ghi jkl mno
  !var %ticks $~ticks
  !while (%a) {
    !noop $eval($ $+ %tok, 2)
    !dec %a
  }
  !var %ticks = $~ticks - %ticks
  return %ticks
}

alias bench9 {
  var %a $1
  var %tok 2
  tokenize 32 abc def ghi jkl mno
  !var %ticks $~ticks
  !while (%a) {
    !noop $ [ $+ [ %tok ] ]
    !dec %a
  }
  !var %ticks = $~ticks - %ticks
  return %ticks
}

alias benchit {
  echo -a Iterations: $1
  echo -a -
  echo -a Bench#1 Single Token Retrieval -- $!gettok vs /tokenize:  $bench1($1) vs $bench2($1)
  echo -a -
  echo -a Bench#2 Multi Token Retrieval -- $!gettok vs /tokenize: $bench3($1) vs $bench4($1)
  echo -a -
  echo -a Bench#3 Var vs Tok >> $bench5($1) vs $bench6($1)
  echo -a -
  echo -a Bench#4 Eval -- [] vs $!() vs $!eval() >> $bench9($1) vs $bench7($1) vs $bench8($1)
}

Last edited by FroggieDaFrog; 01/12/15 01:38 AM.

I am SReject
My Stuff
Joined: Oct 2003
Posts: 3,918
A
Hoopy frood
Offline
Originally Posted By: FroggieDaFrog
mSL is the only language I know of that has such a cross-play between two unrelated data stores; as an analogy, it's similar to if /tokenize altered an unrelated hashtable.


I know quite a few languages that support this, sh/bash scripting being the most obvious (and seemingly the influence behind the $1- design):

Code:
#!/bin/sh
# run as:  ./foo arg1 arg2 arg3
# outputs:
#   arg1
#   bar

echo "$1"
# "set --" replaces the positional parameters, just like /tokenize replaces $1-
set -- bar
echo "$1"


Perl allows for rewriting @_, which also represents the initial arguments. There is no specific command to do so, but it exposes the same functionality. That said, Perl does allow overwriting of arguments in another way with $_. For example, most functions operate by default on the $_ string, which may contain data from the last call. @_ and $_, just like $1- in mIRC, are just stores for data, nothing more.

Basically, $1- is simply the resulting store for the last call to /tokenize, with the assumption that mIRC calls /tokenize (internally) for you to seed the initial parameters. It's effectively a "register" based design. This seems completely consistent with all other register based designs in mIRC-- for example, $regex() fills $regml() of the last call, and future calls will override this value.
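
A small sketch of that "register" behavior, for illustration only (the alias name /registerdemo and the sample strings are made up):
Code:
alias registerdemo {
  ; $1- starts out holding the arguments passed to the alias
  echo -a $1
  ; /tokenize refills the same store
  tokenize 32 bar
  echo -a $1
  ; $regml() likewise reflects only the most recent $regex() call
  noop $regex(hello world, /(\w+)/)
  echo -a $regml(1)
  noop $regex(foo bar, /(\w+)/)
  echo -a $regml(1)
}

Running /registerdemo arg1 arg2 arg3 should echo arg1, bar, hello and foo, in that order: each store holds only the result of the last call that filled it.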

You're calling /tokenize and $1- unrelated (like an "unrelated hash table"), but they are entirely related by design. In fact, the very purpose for exposing /tokenize was specifically to allow scripts to re-write the argument list. I think you might be using /tokenize for an unintended purpose, i.e., to avoid $gettok() and effectively (mis)use $1- as a low-cost "array" syntax. It's certainly an interesting use of $1-, but it doesn't seem like the intended use.

It seems to me that what you really want is array support in mIRC, period. That way you're not needing to use /tokenize at all. That is something I think a lot of people (including me) can support, especially since it's much more generalized and useful for other non-argument-list cases too. But it shouldn't be exposed through /tokenize or $1-, it should be its own thing. Otherwise, you're effectively limiting array syntax to scripts only being able to operate on a single list at a time.

As a sidenote, the loop in your JSON example would actually probably be faster with $ [ $+ [ %n ] ] than $args(N), specifically:

Code:
%call = $+(%call, $chr(91), $qt($ [ $+ [ %n ] ] ), $chr(93))


Or, rather, put in another way, based on the benchmark I posted above, it's unclear that $args(N) would be useful in the above case-- it may end up still being slower, which means nobody would use it for this use case (since it's no more convenient either). This is relevant, because if array syntax does come, it should come as a core syntax, not an identifier (like hash tables), otherwise it too may end up being slower than bracket evaluation at the end of the day (due to the way parsing and function dispatch works in mIRC). Then again, if mIRC's parser / dispatch was ever to be optimized, that problem could be solved and it wouldn't need to be a syntax instead of an identifier (incidentally, using hash tables for a list would probably become faster than bracket-eval too, and maybe an array syntax would not even be necessary).


- argv[0] on EFnet #mIRC
- "Life is a pointer to an integer without a cast"
