Comments on: Hash Datatype Conclusions
OK, a lot of good input was posted on "Change the Hash Datatype
in 3.0?". Thanks.
Generally, I like the strength of the REBOL language to be at the
datatype level. I'd really like to find a good solution to this
issue as a datatype -- not requiring mezz functions (or nothing more than "simplicity wrappers" at mezz level).
As you know, datatypes in REBOL span not only internal representation, but also methods of operation (e.g. block
vs hash) and external representation (e.g. block vs path). These are
important in REBOL, and provide insight into the range of valid solutions.
My view is: hash! works, but at the expense of implementational
complexity (e.g. it was buggy for "good reason"), additional overhead (strict block order maintenance), and a general impedance mismatch to most usage cases (e.g. dictionary).
I think it's this last case that defines the problem and hence dictates the solution. The primary goal of all REBOL datatypes is to provide a
simple concept -- our brains seek a pure, simple abstraction. We know
what integers are and do. We know what blocks are and do. But hash!?
Unfortunately, that's a mystery to many developers. We don't think
of it as a nice concept "nugget".
Yet, the concept of a "dictionary" itself is a simple abstraction and
has found extensive use in many other languages. So, that sounds more
like a proper REBOL datatype, rather than hash.
I think that is the direction we need to go. We don't want to
google the usage of a datatype each time we need it. We want it to pop
into our brains... a simple abstraction. The concept of an associative
array or dictionary lookup does that.
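To make the mismatch concrete, here is a minimal sketch, assuming REBOL 2 behavior where hash! is an ordered series with hashed FIND and SELECT. The word `stock` and its contents are invented for the example.

```rebol
REBOL [Title: "hash! as a series (illustrative sketch)"]

; A hash! keeps keys and values together in one ordered series.
stock: make hash! [apples 3 pears 5]
append stock [plums 7]        ; new pairs land at the tail, order is kept

print select stock 'pears     ; the value that follows the key
print length? stock           ; counts keys and values together

; SELECT does not know which positions are keys, so a value can match too.
probe select stock 3
```

The last line is the surprise: the lookup treats a stored value as if it were a key. A dictionary-style datatype would presumably look only at keys.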
Now comes that sweet and sour process of picking a good datatype name
for it. Do we want to still call it hash!, is it better to call it
assoc! or dictionary!, or can we find a better/shorter name? Got one? Post it. Thanks.
assoc! could mean something else (e.g. 'association, rather than 'associative-array; similar but ambiguous), so I would stick with dictionary!, maybe with a DICT constructor function as you suggested. Hash! also implies a specific implementation in its name; it suggests fast lookup, but hashing may not be the actual implementation.|
As you said, "the concept of a 'dictionary' itself is a simple abstraction". Therefore, using the name "dictionary" or "DICT", as Gregg suggested, seems a plausible choice. I personally like "MAP" (short for mapping or relation), since it is a general mathematical concept: it maps arbitrarily typed objects to arbitrarily typed objects, as associative arrays do. However, at a technical level, I think "MAP" is not as widely known as "DICT" is. In terms of implementation, the following note might be of interest.
Note: "A Judy array is a complex but very fast associative array data structure for storing and looking up values using integer or string keys."
A freely available C library implementation is available at:
I would go for dictionary! since it's used in other languages and people will most likely know what it is if they already know about it from other languages. It does the same thing as in other languages, right? Then it should be called the same. No need to be rebellious here. :-)|
I'm in for dictionary! as well. The meaning is more obvious. A lot of developers know what a hash or an associative array is, but it's something you have to think about for a second. You don't have to think about what a dictionary is, which gives you more spare brain cells for writing code.|
I also recommend selecting a word that has a very transparent meaning. Dictionary! or Dict! may be imperfect but seem like better candidates than any alternative I can think of.
If you select a different word, consider that every single manual, tutorial or instructional example will effectively oblige the author to explain that the construct is analogous to associative dictionaries commonly found in other programming languages.
If it looks like a duck and quacks like a duck, call it a duck, not a waterfowl.
Let's call it dictionary! and not dict!|
I would suggest map! because
a) it maps a key to a value
b) as Felix wrote, it's also a mathematical concept
c) in Java there's a Map interface/class that does exactly the same
d) it's shorter than dictionary! :-)|
I don't know if this makes sense, but just to mention it: in common language, a map is like the entry for a single defined word, while the dictionary is the collection of those words. Logically, should the nomenclature of the language address the fact that a name->(set of values) relationship is a different thing from the set of all those relationships listed together? Maybe there is something to exploit there in the semantics. Of course, to some of us, the full-length name should be the one retained: dictionary!. And maybe entry! as the short name, as in a database entry?
And last, Carl is really wise to give the users a chance to comment openly on design and naming issues. Rebol may not be open source, but still it is open to discussion.
Good comments regarding the name. Naming is always fun, right?
Gregg, thanks for getting the comments rolling... and I agree with your main point.
I like Edoc's note regarding dictionary: "every single manual, tutorial or instructional example will effectively oblige the author to explain that the construct is analogous to associative dictionaries." Yes, we want to avoid that.
Of course, I also like the name map that Felix suggested... because it is short and sweet (technical jargon for "valid usage"). I am tempted to use it, but there are plans regarding mapping functions for 3.0, so we may best want to avoid that conflict. If we have a function called MAP, we don't want it confused with a datatype called MAP!
Also, the length of the name "dictionary!" is not really a big deal, because unlike functions (that need the FUNC helper function to keep us sane), dictionaries are not used very often.
So, clarity wins over brevity in this case. Unless someone discovers a flaw in our logic, then Dictionary! it will be.
I can't possibly resist a "naming" challenge...0-o-o<
relationship! or relation! or warehouse! or organizer! or graph! (graph could be good but has too many other contexts) wallet! socket-organizer!
Hmm, as I think about this, several of these could be names for datatypes derived from hash!, but the list also shows the narrow focus of dictionary!. So the challenge I see is to find a word more general-purpose than dictionary! but more meaningful than hash!
Maybe closet! or carton! provide a more generic name than dictionary! I kinda like closet! as the mystery of the internal hashing form being hidden inside is connoted by the word closet!
keyblock! , keyset!
Brevity could be book! or lexicon! but otherwise dictionary! is understood and expressive.|
database! might be sexy for people like me who think it's time to stick a fork in Oracle (or SQL or whatever...Oracle just makes better poetry). Or datastore! to use a less common name, but less marketable, IMO.|
I always thought 'dictionary' was a poor description of an associative array, although I admit a lot of languages use it.
'map!' would be my top choice, but it looks like there are new mapping functions ( cool! ) which would make it ambiguous.
It's funny no one thought of it... many Amigans here.
Tag lists have stuck in my mind for years after I stopped coding on the Amiga. The name satisfies many of the above comments and is very short too. And tag/item pairs are obvious to understand.
foreach [tag item ] taglist [
The difference between a tag list and a dictionary lies in the fact that a tag list allows a tag to be present more than once... THAT is a very useful feature.
We can then very easily add a tag item several times and then extract a list of all submitted items for any particular tag.
They can also act as dictionaries, with the added constraint that only one of each tag is allowed. That could easily be a /unique refinement on insert and append, for example.
Dicts cannot act as tags, so I'd prefer tags. And also, is it just me or are dicts just a poor object?
I think tags actually cover a broader range of cases. I often use objects for storage, and dictionary! might just get ditched... in the same way we all ditched hash! and list! because we have block!.
Also, the "uniqueness" of a taglist could be a property of the taglist itself.|
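The tag-list idea above can be sketched with a plain block, since no taglist! datatype exists. This is only an illustration: the `items-for` helper and the sample data are hypothetical names made up here, not part of REBOL.

```rebol
REBOL [Title: "Tag list sketch (hypothetical helper)"]

; A plain block of tag/item pairs; the same tag may appear more than once.
taglist: [color "red" size "large" color "blue"]

; Collect every item stored under one tag, in the order it was added.
items-for: func [list [block!] wanted [word!] /local found] [
    found: copy []
    foreach [tag item] list [
        if tag = wanted [append found item]
    ]
    found
]

probe items-for taglist 'color
```

A dictionary-style lookup would return at most one item per key, which is the difference the comment points out.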
I'm fine with dictionary! if that's the consensus, but I personally don't like it - no real reason other than personal preference, though.
Given what it is, and how it's supposed to act, though, I'd borrow from another language (Lua) and call it a table!.
I wouldn't call it a map!, but I would like to take this opportunity to make sure that REBOL 3.0 has mapping functions built into it. They are incredibly useful, and IMO, REBOL is (and should continue to be) incredibly useful out of the box.
I support Sunanda with lookup!.
IMO this name makes it very clear what this datatype is useful for.
I like lookup! too, but I also never had problems with hash! name, just change functionality to desired state ....|
In French, dico is often used as short for dictionary.|
In my opinion dictionary! is good, as it is easily understood by non-native English speakers too.|
The fault with table! and taglist! is the confusion they can generate: the former with databases, the latter with the (HTML) tag! datatype.
If any implementation of native databases is planned, table! can be risky unless an association with that use is intended.
Mario, you are right about the confusion, but you have me thinking that this actually solves one issue I was having with using words as the key... why not actually use tags!
Tags can contain any/most chars, including spaces. :-)
This is already valid REBOL syntax, and would make the lexical form of a taglist! (hash!, dict!, whatever) more clearly differentiated from an object!
"lookup" implies action to me, so I wouldn't want that.
"table", to me, implies two unbounded dimensions, while this datatype is bounded in one dimension.
I don't have a problem using dict! as the actual datatype name, but with a DICT mezz for creation, I don't think it will buy us much. That said, if you provided dict! as an alias for dictionary!, I would bet money that dict! would get used 99% of the time. Dictionary! is more human friendly, because it's a real word, but we won't want to type it all the time.
keyset! (per Jacques) is actually a nice description of the datatype; i.e. keyed+unique. It's shorter and easier to type than dictionary!, but it isn't a familiar term.
"lookup implies action".
That's true Gregg!, but the suffix "!" implies datatype! :-)
Well, as I'm not a native English speaker, I must admit lookup! maybe isn't a good choice
it is short, well defined
I never liked that a hash! could have an odd length?
Just kidding. dictionary! is typeable and readable and gets the point across. And there is no flaw in Carl's logic that I can see. (Although couple! got me when I first thought of it.)
Brian, the problem with couple! is that people will be expecting expressions like:
(sorry, the holidays are getting to my head ;-)
lookup! is the best,
but dictionary! everyone knows it.
Keep hash! too please, at least for me.
It helps with debugging. Probing a dictionary hurts the eyes because of that random order; with a hash, new things are nicely at the end.
And in some cases a sequential thing with fast lookup helps, especially for a diff I wrote. Very rarely useful, but when it helps :)
Brainstorming: Would it be possible to mix dictionaries with objects? It's the same syntax. And we could bind strings too, and when they are loaded they use that binding.
I sort of like taglist!, or maybe just tags!. Having said that, I admit that I have never used hash!. And in other circles, the term "tags" might be misconstrued.
How about orchard! (where pairs come from)?