REBOL 3.0

Comments on: Hash Datatype Conclusions

Carl Sassenrath, CTO
REBOL Technologies
2-Dec-2006 20:09 GMT

Article #0054

OK, a lot of good input was posted on "Change the Hash Datatype in 3.0?". Thanks.

Generally, I like the strength of the REBOL language to be at the datatype level. I'd really like to find a good solution to this issue as a datatype -- not requiring mezz functions (or nothing more than "simplicity wrappers" at mezz level).

As you know, datatypes in REBOL span not only internal representation, but also methods of operation (e.g. block vs hash) and external representation (e.g. block vs path). These are important in REBOL, and provide insight into the range of valid solutions.

My view is: hash! works, but at the expense of implementation complexity (e.g. it was buggy for "good reason"), additional overhead (strict block order maintenance), and a general impedance mismatch with most use cases (e.g. dictionary).
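To make the order-maintenance point concrete, here is a minimal sketch, assuming REBOL 2 semantics where hash! is an ordinary series with a lookup index attached. The name `table` and the sample data are hypothetical, used only for illustration.

```rebol
REBOL [Title: "hash! as an ordered series (REBOL 2 sketch)"]

; Assumption: REBOL 2, where hash! is available and behaves as a series.
; Keys and values are simply alternating items in one ordered series.
table: make hash! ["apple" 3 "pear" 5]

; Dictionary-style use: SELECT returns the item that follows the first match.
print select table "pear"

; Series-style use still works, because block order is maintained.
print first table
print length? table

; Nothing enforces key/value pairing: a single APPEND leaves an odd length.
append table "kiwi"
print length? table
```

The last two lines hint at the mismatch: the structure is used as a dictionary, yet it still has to behave as a positional series.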

I think it's this last case that defines the problem and hence dictates the solution. The primary goal of all REBOL datatypes is to provide a simple concept -- our brains seek a pure, simple abstraction. We know what integers are and do. We know what blocks are and do. But hash! is, unfortunately, a mystery to many developers. We don't think of it as a nice concept "nugget".

Yet, the concept of a "dictionary" itself is a simple abstraction and has found extensive use in many other languages. So, that sounds more like a proper REBOL datatype, rather than hash.

I think that is the direction we need to go. We don't want to google the usage of a datatype each time we need it. We want it to pop into our brains... a simple abstraction. The concept of an associative array or dictionary lookup does that.

Now comes that sweet and sour process of picking a good datatype name for it. Do we want to still call it hash!, is it better to call it assoc! or dictionary!, or can we find a better/shorter name? Got one? Post it. Thanks.

31 Comments

Comments:

Gregg Irwin
2-Dec-2006 15:35:30
assoc! could mean something else (e.g. 'association, rather than 'associative-array; similar but ambiguous), so I would stick with dictionary!, maybe with a DICT constructor function as you suggested. Hash! also implies a specific implementation in its name; while that suggests fast lookup, it may not be the actual implementation.
Felix Almonte
2-Dec-2006 18:54:10
Carl, as you said "the concept of a "dictionary" itself is a simple abstraction". Therefore, using the name "dictionary" or "DICT" as Gregg suggested seems a plausible choice for a name. I personally like "MAP" (short for mapping or relation) since it is a general mathematical concept mapping arbitrarily typed objects to arbitrarily typed objects like associative arrays do. However, at a technical level, I think "MAP" is not as widely known as "DICT" is. In terms of implementation the following note might be of interest.

Note: "A Judy array is a complex but very fast associative array data structure for storing and looking up values using integer or string keys." --> http://en.wikipedia.org/wiki/Judy_arrays

A freely available C library implementation is available at: --> http://judy.sourceforge.net/

Henrik
2-Dec-2006 19:07:39
I would go for dictionary! since it's used in other languages and people will most likely know what it is if they already know about it from other languages. It does the same thing as in other languages, right? Then it should be called the same. No need to be rebellious here. :-)
Brian Wisti
3-Dec-2006 4:46:13
I'm in for dictionary! as well. The meaning is more obvious. A lot of developers know what a hash or an associative array is, but it's something you have to think about for a second. You don't have to think about what a dictionary is, which gives you more spare brain cells for writing code.
Edoc
3-Dec-2006 10:20:40
I also recommend selecting a word that has a very transparent meaning. Dictionary! or Dict! may be imperfect but seem like better candidates than any alternative I can think of.

If you select a different word, consider that every single manual, tutorial or instructional example will effectively oblige the author to explain that the construct is analogous to associative dictionaries commonly found in other programming languages.

If it looks like a duck and quacks like a duck, call it a duck, not a waterfowl.

Robert
3-Dec-2006 12:00:53
Let's call it dictionary! and not dict!
ridcully
3-Dec-2006 12:30:31
I would suggest map! because a) it maps a key to a value b) as Felix wrote, it's also a mathematical concept c) in Java there's a Map interface/class that does exactly the same d) it's shorter than dictionary! :-)
Norm
3-Dec-2006 12:53:29
Don't know if it makes sense. Just to mention that, in common language, a map is like the entry of a defined word; the dictionary is the collection of those words. Logically, should the nomenclature of the language address the fact that a name->(set of values) relationship is a different thing than the set of all those relationships listed together? Maybe there is something to exploit there in the semantics. Of course, to some of us, the full-length name should be the one retained: dictionary!. And, maybe, entry! as the short name of a database entry?

And last, Carl is really wise to give the users a chance to comment openly on design and naming issues. Rebol may not be open source, but still it is open to discussion.

Carl Sassenrath
3-Dec-2006 13:37:08
Good comments regarding the name. Naming is always fun, right?

Gregg, thanks for getting the comments rolling... and I agree with your main point.

I like Edoc's note regarding dictionary: "every single manual, tutorial or instructional example will effectively oblige the author to explain that the construct is analogous to associative dictionaries." Yes, we want to avoid that.

Of course, I also like the name map that Felix suggested... because it is short and sweet (technical jargon for "valid usage"). I am tempted to use it, but there are plans regarding mapping functions for 3.0, so we may best want to avoid that conflict. If we have a function called MAP, we don't want it confused with a datatype called MAP!

Also, the length of the name "dictionary!" is not really a big deal, because unlike functions (that need the FUNC helper function to keep us sane), dictionaries are not used very often.

So, clarity wins over brevity in this case. Unless someone discovers a flaw in our logic, then Dictionary! it will be.

eFishAnt
4-Dec-2006 9:18:10
I can't possibly resist a "naming" challenge...0-o-o<

relationship! or relation! or warehouse! or organizer! or graph! (graph could be good but has too many other contexts) wallet! socket-organizer!

Hmmn, as I think of this, several of these could be names for derived datatypes of hash! but also it shows the narrow focus of dictionary! and so the challenge I see is to find a more general-purpose word than dictionary! but more meaningful than hash!

Maybe closet! or carton! provide a more generic name than dictionary! I kinda like closet! as the mystery of the internal hashing form being hidden inside is connoted by the word closet!

Jacques
5-Dec-2006 4:43:39
Some propositions:

keyblock! , keyset!

Chris
5-Dec-2006 9:55:19
Brevity could be book! or lexicon! but otherwise dictionary! is understood and expressive.
Sunanda
5-Dec-2006 10:34:09
lookup!
eFishAnt
5-Dec-2006 10:39:24
database! might be sexy for people like me who think it's time to stick a fork in Oracle (or SQL or whatever...Oracle just makes better poetry). Or datastore! to use a less common name, but less marketable, IMO.
Charlie
5-Dec-2006 12:34:04
I always thought 'dictionary' was a poor description of an associative array, although I admit a lot of languages use it.

'map!' would be my top choice, but it looks like there are new mapping functions ( cool! ) which would make it ambiguous.

Maxim Olivier-Adlhoch
5-Dec-2006 15:36:02
taglist!

It's funny no one thought of it... many Amigans here.

Tag lists always stuck in my mind years after I stopped coding on the Amiga. It satisfies many of the above comments and is very short too. And tag/item pairs are obvious to understand.

 foreach [tag item] taglist [
     probe tag
 ]

The difference between a tag list and a dictionary lies in the fact that a tag list allows a tag to be present more than once... THAT is a very useful feature.

We can then very easily add a tag item several times and then extract a list of all submitted items for any particular tag.

They can also act as dictionaries, with the added constraint that only one tag of each is allowed. That could easily be a /unique refinement on insert and append, for example.
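A rough sketch of that "extract all items for a tag" idea, using a plain block of tag/item pairs since no taglist! datatype exists. The helper name `items-for` and the sample data are hypothetical; this only illustrates the behavior described, assuming ordinary REBOL 2 block functions.

```rebol
REBOL [Title: "Taglist sketch with plain blocks"]

; Hypothetical helper: collect every item stored under a given tag.
items-for: func [taglist [block!] wanted /local result] [
    result: copy []
    foreach [tag item] taglist [
        ; APPEND/ONLY keeps block items intact instead of splicing them in.
        if tag = wanted [append/only result item]
    ]
    result
]

; The same tag may appear more than once.
taglist: [color "red" size 10 color "blue"]

; Collects both items stored under the color tag, in insertion order.
probe items-for taglist 'color
```

A dictionary-style variant would only need to check for an existing tag before appending, which is roughly what a /unique refinement would do.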

dicts cannot act as tags, so I'd prefer tags. and also, is it just me or are dicts just a poor object?

I think tags actually allow a broader range of cases. I often use objects for storage, and dictionary! might just get ditched... in the same way we all ditched hash! and list! because we have block!.

Maxim Olivier-Adlhoch
5-Dec-2006 16:22:28
also the "uniqueness" of a taglist could be a property of the taglist itself.
Jeff M.
5-Dec-2006 18:38:13
I'm fine with dictionary! if that's the consensus, but I personally don't like it - no real reason other than personal preference, though.

Given what it is, and how it's supposed to act, though, I'd borrow from another language (Lua) and call it a table!.

I wouldn't call it a map!, but I would like to take this opportunity to make sure that REBOL 3.0 has mapping functions built into it. They are incredibly useful, and IMO, REBOL is (and should continue to be) incredibly useful out of the box.

Alc
5-Dec-2006 19:42:03
I support Sunanda with lookup!.

IMO this name makes it very clear what this datatype is useful for.

-pekr-
6-Dec-2006 1:02:37
I like lookup! too, but I also never had problems with hash! name, just change functionality to desired state ....
jf_allie
6-Dec-2006 13:22:28
In French, dico is often used as short for dictionary.
Mario Cassani
7-Dec-2006 4:16:06
In my opinion dictionary! is good, as it is easily understood by non-native English speakers too.
The faults in table! and taglist! are due to the confusion they can generate: the former with databases, the latter with the (HTML) tag! datatype.
If any implementation of native databases is planned, table! can be risky if an association with one of its uses is not meant.
Maxim Olivier-Adlhoch
9-Dec-2006 14:55:31
Mario, you are right about the confusion, but you have me thinking that this actually solves one issue I was having wrt using words as the key... why not actually use tags!

tags can contain any/most chars, including spaces. :-)

taglist/<key name>

this is already valid rebol syntax, and would make lexical form of a taglist! (hash!, dict!, whatever) more differentiated wrt an object!

Gregg Irwin
11-Dec-2006 16:05:16
"lookup" implies action to me, so I wouldn't want that.

"table", to me, implies two unbounded dimensions, while this datatype is bounded in one dimension.

I don't have a problem using dict! as the actual datatype name, but with a DICT mezz for creation, I don't think it will buy us much. That said, if you provided dict! as an alias for dictionary!, I would bet money that dict! would get used 99% of the time. Dictionary! is more human friendly, because it's a real word, but we won't want to type it all the time.

keyset! (per Jacques) is actually a nice description of the datatype; i.e. keyed+unique. It's shorter and easier to type than dictionary!, but it isn't a familiar term.

Alc
11-Dec-2006 20:55:52
"lookup implies action".

That's true Gregg!, but the suffix "!" implies datatype! :-)

Well, as I'm not a native English speaker, I must admit lookup! maybe isn't a good choice.

tom
14-Dec-2006 21:36:28
onto!

it is short, well defined

tom
14-Dec-2006 21:44:54
I never liked that a hash! could have an odd length?

Brian Tiffin
16-Dec-2006 21:21:11
How about
  • cue!
  • clue!
  • cipher!
  • guide!
  • marshal!
  • herd!
  • gate!
  • twin!
  • couple!
  • key-value-pair-where-the-key-is-hashed!

Just kidding. dictionary! is typeable and readable and gets the point across. And there is no flaw in Carl's logic that I can see. (Although couple! got me when I first thought of it.)

Maxim Olivier-Adlhoch
21-Dec-2006 11:25:22
Brian, the problem with couple! is that people will be expecting expressions like:

couple/wed
couple/copulate
couple/divorce

(sorry, the holidays are getting to my head ;-)

Volker
28-Dec-2006 6:01:30
lookup! is the best, but everyone knows dictionary!.

Other thoughts:

Keep hash! too please, at least for me. It helps with debugging. Probing a dictionary hurts the eyes because of that random order; with a hash, new things are nicely at the end. And in some cases a sequential thing with fast lookup helps, especially for a diff I wrote. Very rarely useful, but when it helps :)

Brainstorming: Would it be possible to mix dictionaries with objects? It's the same syntax. And we could bind strings too, and when they are loaded they use that binding.

mrzoon
16-Feb-2007 14:06:09
I sort of like taglist!, or maybe just tags!. Having said that, I admit that I have never used hash!. And in other circles, the term "tags" might be misconstrued.

How about orchard! (where pairs come from)?

Updated 23-Apr-2017 - Copyright REBOL Technologies - REBOL.net