Arrays - mIRC Discussion Forums

mIRC supporting arrays would make several scripts a lot easier to write, instead of having to do long-winded things with $gettok or however else you script around it.

And what would the syntax be?

mIRC supports arrays, use hash tables. The API would be the same anyway, in that the only sane syntax would be to have some kind of $array(name,index) identifier, which is equivalent to $hget(name,index)

Arrays and hash tables aren't equivalent, for reasons we've been over before. I would like to see arrays in mIRC.

Yes, this was discussed before. The fact remains that most users aren't concerned about the efficiency differences in deleting items in an Array, and access is nearly equivalent. Finally, if the minor API differences are really that scary, it's trivial to wrap the /hadd and $hget identifiers into a /array and $array combo which hides all of the "confusing" hash specific functionality.

Just to be clear, arrays and hash tables might not be theoretically equivalent, but a few very popular languages do fine without array implementations. I'm speaking of course about Javascript and PHP, which both implement "arrays" through associative arrays (or hash tables). For all intents and purposes, they are equivalent in the practical sense for a huge population of real world programmers and code. Practical and historical evidence suggests that using an associative array as a "true" indexed array is virtually indistinguishable to the programmer and at least functionally equivalent, so maybe we should drop the theoretics in the argument and focus on practical benefits.

Quote:

The fact remains that most users aren't concerned about the efficiency differences in deleting items in an Array

You're basing that on what exactly? That's insane. The inefficiency of using hash tables that way is ridiculous.

Quote:

if the minor API differences are really that scary, it's trivial to wrap the /hadd and $hget identifiers into a /array and $array combo which hides all of the "confusing" hash specific functionality.

They're two fundamentally different concepts. It isn't a "minor API difference". It isn't just "hash specific" functionality, there'd also be "array specific functionality", because the two things only marginally overlap which is why it's stupid to try and combine the two at all.

Quote:

a few very popular languages do fine without array implementations. I'm speaking of course about Javascript and PHP

PHP's conflation of lists and mappings into a single "array" is one of the major flaws in the language. And Javascript has (mostly) proper arrays. The associative array behaviour available in Javascript is inherent to all objects as another means of accessing their properties. The consequences of that lead to some of the most broken parts of the language (e.g. being unable to prototype new Array methods because they'll appear as items in any for...in loop).

Originally Posted By: starbucks_mafia

You're basing that on what exactly? That's insane. The inefficiency of using hash tables that way is ridiculous.

Immutable data structures are common. Would you be opposed to immutable arrays in mIRC? If you treat them as such, the structures themselves are fine. In any case, the "inefficiency" of using hash tables is not due to the hash tables themselves, but mIRC's god awful parsing slowness. That's not really an issue with using associative arrays. Adding extra syntaxes to get around flaws in the language isn't a good solution, it's just a workaround. There's nothing fundamentally wrong with using associative arrays themselves, so it's misleading to claim that the inefficiency is in the hash table-- it's not. There are things mIRC could do instead to optimize these slow points without adding in new syntaxes.

Originally Posted By: starbucks_mafia

They're two fundamentally different concepts.

This is another pointless computer science theoretics discussion, and quite wrong. The fact is that "associative arrays" are a generalization of indexed arrays, conceptually speaking. An associative array is a collection of objects in a mapping of one key to a value, usually in a one to one relationship. An indexed array is just a subset of an associative array whereby the keys are all indexed positive integer values. This is why the two types of arrays map so seamlessly together in languages like PHP/JS. There is no fundamental difference, except for the key type restriction in indexed arrays. All of the operations on associative arrays carry over for indexed ones, the only difference being that insertion/deletion involve changing other keys. I don't consider that "fundamentally" different.
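To make the "indexed array as a restricted associative array" claim concrete, here is a minimal sketch. It is written in Python rather than mIRC script purely for illustration, and the function names are hypothetical. It treats a mapping as an indexed array exactly when its keys are the integers 1..n.

```python
def is_indexed_array(mapping):
    """True when the keys are exactly the integers 1..n (n may be 0)."""
    return set(mapping) == set(range(1, len(mapping) + 1))


def to_list(mapping):
    """Read an index-keyed mapping back out in index order."""
    if not is_indexed_array(mapping):
        raise ValueError("keys are not 1..n")
    return [mapping[i] for i in range(1, len(mapping) + 1)]


colours = {1: "red", 2: "green", 3: "blue"}   # keys 1..3: usable as an array
nicks = {"argv0": "op", "hixxy": "voice"}     # arbitrary keys: only a mapping

print(is_indexed_array(colours))  # True
print(is_indexed_array(nicks))    # False
print(to_list(colours))           # ['red', 'green', 'blue']
```

The sketch only covers reading and classification; the re-keying needed on insertion and deletion is the part the two sides of this thread disagree about.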

But that's all beside the point-- I'm not interested in the theory. The fact that you can trivially create an array API based on hget/hadd (and even hdel, though we agree it's inefficient) proves that the APIs have minor differences. Here it is, for immutable(ish) arrays:

Code:

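; Add a value to an array (appends at index count + 1)
; $hget(table,0).item is the item count; without .item it would look up an item named "0"
alias aadd { hadd -m $+(array,$1) $calc($hget($+(array,$1),0).item + 1) $2- }
; Get a value from an array
alias aget { return $hget($+(array,$1),$2) }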

; USAGE:
; //aadd myarray foo | aadd myarray bar
; //echo -a $aget(myarray,1) == foo
; //echo -a $aget(myarray,2) == bar

Seems like a rather trivial API difference to me. Again, insertion/deletion could be wrapped in a similarly trivial way, with the efficiency caveats, of course.
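For readers who want to see what "wrapped in a similarly trivial way" might look like, here is a rough Python sketch of the same idea (Python only for illustration, not mIRC script). The `aadd` and `aget` methods mirror the aliases above; `ains` and `adel` are hypothetical names for the insert and delete wrappers, and they show where the efficiency caveat comes from: every entry after the affected index has to be re-keyed.

```python
class HashArray:
    """1-based array stored in a dict, mirroring the aadd/aget aliases."""

    def __init__(self):
        self.items = {}          # plays the role of the mIRC hash table

    def aadd(self, value):
        # Append at index count + 1.
        self.items[len(self.items) + 1] = value

    def aget(self, index):
        # A missing index gives None, much as $hget gives $null.
        return self.items.get(index)

    def ains(self, index, value):
        # Hypothetical insert: re-key index..n upward, last entry first.
        n = len(self.items)
        if not 1 <= index <= n + 1:
            raise IndexError(index)
        for i in range(n, index - 1, -1):
            self.items[i + 1] = self.items[i]
        self.items[index] = value

    def adel(self, index):
        # Hypothetical delete: re-key index+1..n downward, then drop the last key.
        n = len(self.items)
        if not 1 <= index <= n:
            raise IndexError(index)
        for i in range(index, n):
            self.items[i] = self.items[i + 1]
        del self.items[n]


arr = HashArray()
arr.aadd("foo")
arr.aadd("bar")
arr.ains(1, "baz")   # baz, foo, bar
arr.adel(2)          # baz, bar
print(arr.aget(1), arr.aget(2), arr.aget(3))  # baz bar None
```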

Originally Posted By: starbucks_mafia

PHP's conflation of lists and mappings into a single "array" is one of the major flaws in the language. And Javascript has (mostly) proper arrays.

This double standard doesn't actually make sense. Javascript and PHP share the exact same API and conceptual implementation of the "array". If PHP is "flawed" because of this decision, then so must Javascript be. In any case, this is a hugely subjective debate-- many would argue that it's one of PHP's most powerful features-- this same API is certainly one of JS's most powerful features. The efficiency might vary between the two implementations, but that is an implementation detail, not a conceptual one, and we're not here to discuss implementation details. How can assoc arrays be a flaw in one language but a "mostly ok" feature in another, conceptually? I think associative arrays make plenty of sense in a high level weakly typed scripting language like JS, PHP and even mIRC. Conceptually speaking, using hash tables as the associative arrays of the other languages makes perfect sense. I've used them in mIRC with little issue many times before.

Just FYI, mIRC could always optimize a hash table for insertion/deletion if a switch was supplied during creation. This is why I had previously suggested the -i switch in the thread you linked, to tell mIRC to optimize the keys as indexed integer values. Using such a switch, mIRC could collapse/shift indexes during insertion/deletion-- or even just store the internal structure as a pure array for super-duper-efficiency, if Khaled wanted. This would give you the power of associative arrays in one single interface that scripters are already used to, rather than splitting it up into an API to manipulate associative arrays with indexed integers and associative arrays for anything else. I think a unified interface would be easier.
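A minimal sketch of the "one interface, two backends" idea described above, again in Python for illustration only. The `indexed` flag stands in for the proposed -i switch, and the class and method names are assumptions made for this example; nothing here reflects how mIRC stores hash tables internally.

```python
class Table:
    """One hadd/hget/hdel-style interface; `indexed` stands in for the proposed -i."""

    def __init__(self, indexed=False):
        self.indexed = indexed
        self.data = [] if indexed else {}   # list backend only when indexed

    def hadd(self, key, value):
        if not self.indexed:
            self.data[key] = value
            return
        pos = int(key) - 1                  # keys are 1-based integers
        if 0 <= pos < len(self.data):
            self.data[pos] = value          # overwrite an existing slot
        elif pos == len(self.data):
            self.data.append(value)         # extend by exactly one
        else:
            raise IndexError(key)

    def hget(self, key):
        if not self.indexed:
            return self.data.get(key)
        pos = int(key) - 1
        return self.data[pos] if 0 <= pos < len(self.data) else None

    def hdel(self, key):
        if not self.indexed:
            self.data.pop(key, None)
            return
        pos = int(key) - 1
        if 0 <= pos < len(self.data):
            del self.data[pos]              # later items shift down automatically


t = Table(indexed=True)
for word in ("one", "two", "three"):
    t.hadd(len(t.data) + 1, word)
t.hdel(1)
print(t.hget(1), t.hget(2), t.hget(3))  # two three None
```

The calling code is identical for both kinds of table; only the flag at creation time changes the behaviour on deletion.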

No. Just... no.

No matter how many optimisations Khaled makes to the scripting language, a built-in array implementation is going to be far superior to any possible scripted alternative in terms of speed. It will also mean that scripts which use array functionality will be more portable; you can copy/paste a small script rather than having to provide the entire scripted array implementation along with it.

I'm not sure why you're so against this feature being added. mIRC has no built-in sorted data type; we have to script our own using the, in your own words, "god awful parsing slowness" of mIRC (with hash tables).

I haven't seen a good reason from you against arrays being implemented. Note: the fact that they can be scripted is not a good reason.

That's because he isn't against array implementation, he is only discussing the syntax.

Originally Posted By: Wims

That's because he isn't against array implementation, he is only discussing the syntax.

Two of his posts in this thread:

Originally Posted By: argv0

mIRC supports arrays, use hash tables. The API would be the same anyway, in that the only sane syntax would be to have some kind of $array(name,index) identifier, which is equivalent to $hget(name,index)

Originally Posted By: argv0

Yes, this was discussed before. The fact remains that most users aren't concerned about the efficiency differences in deleting items in an Array, and access is nearly equivalent. Finally, if the minor API differences are really that scary, it's trivial to wrap the /hadd and $hget identifiers into a /array and $array combo which hides all of the "confusing" hash specific functionality.

Just to be clear, arrays and hash tables might not be theoretically equivalent, but a few very popular languages do fine without array implementations. I'm speaking of course about Javascript and PHP, which both implement "arrays" through associative arrays (or hash tables). For all intents and purposes, they are equivalent in the practical sense for a huge population of real world programmers and code. Practical and historical evidence suggests that using an associative array as a "true" indexed array is virtually indistinguishable to the programmer and at least functionally equivalent, so maybe we should drop the theoretics in the argument and focus on practical benefits.

These seem to be arguing against their implementation by suggesting that existing features suffice, when they're not the same in my opinion.

I am quite simply arguing that associative arrays can and do successfully abstract indexed arrays in other languages, and can therefore be implemented in the same fashion in mIRC, i.e., without changing any external script APIs / syntax (well, with no major syntax adjustments anyway).

Nobody ever suggested syntax changes; merely appropriate identifiers and commands to access an array structure.

Right, but I'm also saying that the API shouldn't need to change either (as it also isn't necessary in other languages to support both arrays and hash tables in one unified API).

One language made the mistake of believing that indexed arrays and associative arrays are the same. Javascript does not; the array-like syntax for accessing properties in Javascript does not in any way affect an actual array. And against that one language there are a thousand that realize you shouldn't put your apples and oranges in a single structure. PHP's associative arrays aren't the same as mIRC's hash tables anyway; for one thing, they're ordered. Not that that makes the design any less flawed...

In any case, the simple fact is that arrays and hash tables are not the same, so trying to shoehorn them into using the same commands and identifiers is ridiculous.

As for syntax:

Code:

/amake <aname> <1dem 2dem ... nDem>
/aset -i<dem1[,dem2,demN]> <aname> <value>
/aset <aname> {{value,value,value},{value,value,value},....}

/echo -a $aget(name,dem1,dem2,....)

Plus the add, remove, free, etc. commands.

Although we are used to using {}'s with arrays, I don't think that's a good idea in mIRC because {}'s have their own meaning and I think it could become a problem if we use them for arrays. I'd stick with just ()'s like any other command. It would be fairly easy to parse the following without needing to use {}'s.

Code:

/aset <aname> ((value,value,value),(value,value,value),....)

Or ;'s can be used between them (with or without the ()'s as shown above)...

Code:

/aset <aname> (value,value,value;value,value,value;....)

I just think it will be better overall to avoid trying to use {}'s for this when it will require a lot of changes (I believe) to the parser to know how to properly treat them and when to change indenting.

Does -i insert at an index (and therefore shift)? Or does it just set the element at the index? This is a fairly important distinction.

Also how would you shift indexes on a multi dimensional array? Which indexes shift? Note that if multi-dimensional arrays don't shift then you can just use hashes for multi-dimensional structures and simplify array commands to single dimension lists only-- drastically simplifying, at that.

Fair enough.

I don't really care about the implementation, would just like the feature.

It doesn't really make a difference if we have to use:

Code:

amake blah 100
aadd blah one
aadd blah two
aadd blah forty three thousand

Or:

Code:

hmake -i blah 100
hadd blah one
hadd blah two
hadd blah forty three thousand

Does it?

hixxy, what does the -i switch stand for with the hmake?

Originally Posted By: hixxy

I don't really care about the implementation, would just like the feature.

It doesn't really make a difference if we have to use: (code sample 1) Or (code sample 2)

Does it?

It does, though.

~6 new commands, 5 new identifiers, and a plethora of new documentation that not only has to all be kept in sync, but adjusted if any features shared between hashes and arrays ever change. Let me pose it to you the other way: do we really need an $hfind *and* an $afind?

I'm talking about from a user perspective. None of that matters to me; it may matter to Khaled (I suspect it would), but that's for him to decide.

As a user, it makes no difference to me what the implementation is so long as the feature is there. There seems to be little point in debating over the implementation when it makes no difference to the end user. Khaled is capable of making up his own mind.

Could you be kind enough to answer my question before Argv0's? It seemed to have been left out in the cold to favor the debate that's going to be up to Khaled ultimately.

-i for hmake is what argv0 proposed to tell mIRC that the table is an array rather than a hash table, so mIRC can use the appropriate data structure internally.

mIRC really needs a list structure, even if it is just a pseudo @window that could be used with (some) window-related commands but involved only a line buffer, with no window creation or drawing. I'm all for an implementation of arrays that addresses this shortcoming!

Thanks, Wims. I thought it was a default, undocumented switch. Silly me.

Well no, I'm not asking for mIRC to use a different data structure, as hashes work fine. mIRC just needs to treat the keys as integers rather than strings, and shift indices on insert/delete, though this part is optional.

Originally Posted By: argv0

Just FYI, mIRC could always optimize a hash table for insertion/deletion if a switch was supplied during creation. This is why I had previously suggested the -i switch in the thread you linked, to tell mIRC to optimize the keys as indexed integer values. Using such a switch, mIRC could collapse/shift indexes during insertion/deletion-- or even just store the internal structure as a pure array for super-duper-efficiency, if Khaled wanted.

Then why did you raise that idea (in bold), if not because you know that's what people want, especially when you know that's the best thing to do for efficiency? I thought you wanted to keep the hash table syntax for simplicity but that you still wanted a real array internally.
Saying you want -i only for the shifting part because it would allow us to have an array-like way to store data is wrong for two main reasons: that's not the good way to do it (but you know that) and:

Quote:

The fact remains that most users aren't concerned about the efficiency differences in deleting items in an Array

This is your opinion, not fact, and there might be many more people who would use the real advantages of an array than you think.
As a last resort, this is how you have argued several times in the past, so make your choice:

1) I want to see real array implemented with all the efficiency it implies
2) I want to see array implemented using hash table, with all the inefficiency it implies

Originally Posted By: Wims

Then why did you raise that idea (in bold), if not because you know that's what people want

If you bolded the full phrase, I added: "if Khaled wanted". It certainly wouldn't be my choice of implementation, I'm just laying out all the options. I've made it clear quite a few times what I'd like to see, but of course I realize there are other opinions involved.

Originally Posted By: Wims

it would allow us to have an array-like way to store data is wrong for two main reasons: that's not the good way to do it

It's "wrong" because "that's not the good way to do it"? That's pretty vague-- care to elaborate? I don't see any idiomatic problems with storing array data in an associative structure-- it would be equivalent to storing an array of array pointer structures internally in C (minus the tiny overhead of the linked list hash structure that goes along with a hash impl). The point of -i would be to optimize the insertion/deletion by changing the hash function to simply be h(n) = n, and, again optionally, ensuring that only 1 value can be associated with a key by shifting on h(n+1) or h(n-1).

Reusing the existing hash infrastructure would mean a more stable implementation that takes advantage of all of the existing features for hash tables, of which there are many. Some that you might be discounting are $hfind, /hsave, and /hload, any of which risks being omitted from an initial array implementation, and all of which have important use cases for array structures. This is the argument I am posing.

Originally Posted By: Wims

1) I want to see real array implemented with all the efficiency it implies
2) I want to see array implemented using hash table, with all the inefficiency it implies

These two options are not mutually exclusive, and are also loaded logical fallacies. The "inefficiency" that everyone is pointing to regarding hash tables is unfounded and based solely on mIRC's parsing capabilities, not the efficiency of hash structures themselves. In other words, the claim is misplaced. Also note that there is nothing inherently more efficient about arrays than hashes. Let's review some CS 101:

Hash table lookup: O(1). Array lookup: O(1).

We're even so far.

Hash table deletion: O(1). Array deletion: O(n).

Hashes are more efficient! This of course implies different behaviour, though-- to make it fair:

Hash table deletion with shifting: O(n). Array deletion: O(n).

Equal. Insertion is the same.

So, show me where arrays are significantly more efficient? The theory shows that you are wrong-- many existing efficient hash implementations would agree with the theory (to the point where our own basis of benchmarking, $ticks, would probably not be able to differentiate). And, especially since most of the overhead in hashes can be optimized out if you know your key values are integer indexes, it almost becomes moot.
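The comparison above can be checked by counting element writes rather than timing anything. The following is a small Python sketch (for illustration only; the function names are hypothetical) that deletes one item from a contiguous array and from a hash keyed 1..n, with and without shifting. It counts writes, so it says nothing about mIRC's actual parsing or lookup costs, and the O(1) hash figures are average-case.

```python
def delete_from_array(items, index):
    """Delete 1-based `index` from a list by hand; return how many slots moved."""
    moves = 0
    for i in range(index - 1, len(items) - 1):
        items[i] = items[i + 1]
        moves += 1
    items.pop()
    return moves


def delete_from_hash(table, index, shift):
    """Delete key `index` from a dict keyed 1..n; return how many keys were rewritten."""
    n = len(table)
    if not shift:
        del table[index]     # leaves a gap at `index`
        return 0
    moves = 0
    for i in range(index, n):
        table[i] = table[i + 1]
        moves += 1
    del table[n]
    return moves


n = 1000
print(delete_from_array(list(range(n)), 1))                                  # 999
print(delete_from_hash({i: i for i in range(1, n + 1)}, 1, shift=False))     # 0
print(delete_from_hash({i: i for i in range(1, n + 1)}, 1, shift=True))      # 999
```

Deleting the first of 1000 items costs 999 writes in both the array and the shifting hash, and none in the non-shifting hash, which matches the O(n) versus O(1) rows above.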

I didn't think -i would change the hash function, just do the shifting with the current implementation. That's what I called the wrong way to do it.

Even if it didn't change the hash function, the performance difference would be minor. Again, we're talking only the cost of overhead of a function that's probably not much more complex than sumof(characters) % bucketsize. If you want specific timings, you could probably implement proof of concept implementations, but I doubt you'll see a significant cost. And the benefits I described justify the cost IMO. That's not the wrong way to do it.
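To give a feel for the two hash functions being compared, here is a short Python sketch (illustration only). `char_sum_hash` follows the "sumof(characters) % bucketsize" guess from the post; mIRC's real hash function is not documented in this thread, so treat it as an assumption. `identity_hash` is the h(n) = n idea, with a modulo added here (also an assumption) so that it fits a fixed number of buckets.

```python
def char_sum_hash(key, buckets):
    """Sum of character codes modulo the bucket count (the guess in the post)."""
    return sum(ord(ch) for ch in key) % buckets


def identity_hash(index, buckets):
    """h(n) = n, folded into the bucket count."""
    return index % buckets


buckets = 10
print([char_sum_hash(str(i), buckets) for i in range(1, 6)])  # [9, 0, 1, 2, 3]
print([identity_hash(i, buckets) for i in range(1, 6)])       # [1, 2, 3, 4, 5]

# Keys with the same digits in a different order collide under the character sum,
# but not under the identity hash.
print(char_sum_hash("12", buckets) == char_sum_hash("21", buckets))  # True
print(identity_hash(12, buckets) == identity_hash(21, buckets))      # False
```

Both functions are a handful of arithmetic operations per lookup, which is the point being made about the cost difference being small.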