Comments on: The ZERO? question
I have to say that once again, it is ironic that the simplest concepts are often the most complex. The definition of ZERO? is a good example.
The original definition of ZERO? was designed to complement POSITIVE? and NEGATIVE?. It was designed for signed scalar values (e.g. integer, decimal, money), as in:
i: to-integer ask "Enter a number?"
if zero? i [...]
If you used ZERO? for strings or other non-scalar datatypes, it caused an error.
But then, after a few years, I found that I wanted to be able to test if any value was zero. After all, this condition works fine:
if val = 0 [...]
even when val is not scalar. If val is "hello", then we know that "hello" is not zero, so the above condition is false.
It seemed logical that ZERO? should work the same way:
if zero? val [...]
But, in R2, if val is a string, then ZERO? throws an error, and your code blows out.
It seemed reasonable to allow ZERO? to accept any datatype and be the equivalent of the val = 0 comparison case. So, I allowed it.
Good, now I could check any value for zero, and I was happy for a while. ZERO? had become a dual test: it checked that the datatype was scalar and, if so, whether the value was zero.
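A minimal sketch of that relaxed behavior, written as an ordinary function rather than a change to the native. The name `any-zero?` is hypothetical, chosen so the built-in ZERO? is left untouched; it relies only on the `val = 0` comparison described above.

```rebol
; ANY-ZERO? is a hypothetical helper name, not a built-in.
; It performs the same comparison as `val = 0`, so a value of a
; non-scalar type (such as a string) gives a false result
; instead of raising an error.
any-zero?: func [val] [val = 0]

print any-zero? 0        ; integer zero
print any-zero? "hello"  ; a string is not zero, and no error is raised
```

Whether pseudo-scalars such as 0:00 or 0.0.0 compare equal to the integer 0 depends on the interpreter version, so this sketch makes no claim about them.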
As a result, I started using a lot more zeros in my data structures, especially those that had to be loaded from a file or database, because I knew that zero loaded more efficiently than NONE (no hash was required).
For example, where I would once create REBOL-stored DB records like this:
["name" none none]
I would use:
["name" 0 0]
Because using 0 does not require a hash-and-compare operation (as does the word NONE), nor does it require binding (as does the value NONE)... nor is it the long-form literal #[none], which seems just a bit too cumbersome for my coding style.
This may not be important for loading a few hundred records, but if you load a few thousand records, well, I just have to believe it makes a small difference. (And, certainly, feel free to prove me wrong.)
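For anyone who wants to check that hunch, here is a rough timing sketch. The helper name `time-load`, the record count, and the repetition count are all assumptions picked for illustration, and wall-clock timing with NOW/PRECISE is coarse, so treat any numbers it prints as indicative only.

```rebol
; Build two source strings holding the same number of records,
; one using the word NONE and one using the literal 0.
count: 10000
src-none: make string! 32 * count
src-zero: make string! 32 * count
loop count [
    append src-none {["name" none none] }
    append src-zero {["name" 0 0] }
]

; TIME-LOAD is a hypothetical helper: it loads the text REPS times
; and returns the elapsed wall-clock time as a time! value.
time-load: func [text [string!] reps [integer!] /local start] [
    start: now/precise
    loop reps [load text]
    difference now/precise start
]

print ["none records:" time-load src-none 20]
print ["zero records:" time-load src-zero 20]
```

This measures LOAD only. Any later cost of reducing or binding the word NONE to get the actual none value is not included.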
Also, I began thinking about "pseudo-scalars". For example, it seemed reasonable that 0:00 (a time) and 0.0.0 (a tuple) would be true for ZERO?, right? Ok, so let's allow that.
But, that leads down a longer road... doesn't it? Is a BITSET that contains only zeros true for ZERO? too? What about a BINARY, VECTOR, or even an IMAGE?
So now the definition of ZERO? begins to extend into the SERIES datatypes. The operation appears useful, but as I think more deeply I also wonder how likely is it for us to actually make that kind of test for zero? If we never really do it, then we don't really need to support it.
And, if we open ZERO? to be valid for all series, now we also have the case where a string may in fact be TRUE for ZERO? It would be a string of nulls.
So, at this point it's best to stop and re-examine the purpose and definition of ZERO?. What do we really need and what should it really do?
For example, if we find another way to make NONE more efficient for large data sets, does that change how we use 0 for such cases?
A bitset can only contain one zero, or none. If a bitset contains a zero then the first bit is set - that is the bit that corresponds to 0 or #"^(00)". If the underlying representation doesn't have any bits set, the bitset is empty, not zero. A bitset is a set.
I think that ZERO? should only return true for numbers with a value equivalent to 0. I'm not sure whether you would extend that to time or tuples, but definitely don't extend it to series types. Limit ZERO? being true to values you can do math with.
If you can figure out how to make NONE more efficient, please do. I use it all of the time and many more functions in R3 are being changed to generate or work with NONE better.
Men use words to describe their known universe and the things they invent to measure their universe. Doing so helps make that universe known to them.
There is such a thing as the numeral nul (zero, if you speak Italian Latinate) and there is such a thing as the number nul -- a measure of no thing.
There is such a thing as 0:00 (start time).
Yet, 0.0.0 never means nul (zero). This is a positive numeral "English word" used to label a thing. A bitset with all nuls is a positive bitset since nuls here are numerals and not numbers.
A programmer should be able to make tests for truth for numbers that are nul, money that is nul (too bad if that is your bank account), time that is nul.
Yet, no true meaning happens if a programmer could test for nul as numerals in a tuple or a bitset.
The design of all programming languages ought to happen based upon men being able to represent their known universe. Otherwise, you end up designing a way to confuse yourself more than just handwriting things and ditching the computer altogether.
What about a short, simple literal representation for NONE then? If that's what you actually want... Maybe #0 ?
I'd still keep ZERO? just for scalars and at most pseudo-scalars (zero? on a time makes perfect sense, on tuple maybe).
I agree the road of other options besides scalars isn't a good one. I like Gabriele's suggestion of solving the real issue of NONE being a little less performant in something that can be a common value representation issue.
The problem with NONE not being that efficient is really more of a syntax problem. The serialized representation #[none] doesn't have the word lookup overhead, but it may be too awkward to type, and is perhaps ugly looking. If there were a better syntax than #[none] then people would just use that, efficiently.
ZERO? for numbers only. Easy to understand, no guesses involved.
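As a sketch of what "numbers only, no guesses" could look like when written as a wrapper: the name `number-zero?` is hypothetical, and it assumes "numbers" means the number! typeset. Whether money, time, or tuple values should also qualify is exactly the open question in this thread, so they are left out here.

```rebol
; NUMBER-ZERO? is a hypothetical name, not a built-in.
; It is true only for number! values equal to zero, and false
; for every other datatype instead of raising an error.
number-zero?: func [val] [
    either number? val [zero? val] [false]
]

print number-zero? 0      ; an integer zero
print number-zero? 0.0    ; a decimal zero
print number-zero? "0"    ; a string, so false rather than an error
```

Extending the test to other scalars would be a matter of widening the first check, for example `any [number? val money? val]`.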
Using 0 instead of none sounds like a bit of an over-"optimization". If you're _really_ concerned about speed, why not do a binary version of REBOL data? (And if you're not willing to do that, I'd venture you're not _really_ concerned about speed ;-).
>> zero? 2008-01-01
** Script error: cannot use zero? on date! value
keep zero? for scalars only, adding pseudo-scalars like tuples and vectors, maybe. in any case, the extended definition should be documented explicitly.
there should be another test which functions exactly like Python conditionals. it creates very clean code, and the type supplies its own representation of true or false.
basically, it would be the functional equivalent of:
attempt [true? data]
attempt [not zero? data]
attempt [not empty? data]
although it should be coded another way for speed, this is the most readable way to code it now.
why? you can use the SAME word for practically all of your "this data is or contains a value which has meaning" tests, which is what the vast majority of simple conditionals check.
obviously, if your algorithm expects specific types, your data should already be aligned, but many times we only need to enter a function if there really is something to set or change.
some names for this function could be:
?? has? is? contains? meaningful?
initialized? useful? filled?
the logic! type testing always seems to hinder a universal naming for this function, but the idea is there..
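One possible way to code the combined test without the three ATTEMPT calls is to dispatch on the datatype. This is only a sketch: the name `meaningful?` is hypothetical (along the lines of the names suggested above), and the choices that none is false and that any other non-series value is true are assumptions, not something the proposal spells out.

```rebol
; MEANINGFUL? is a hypothetical name for the proposed test.
; Each datatype family supplies its own idea of "has a value":
;   none    -> false
;   logic   -> the value itself
;   number  -> true unless zero
;   series  -> true unless empty (at its current position)
;   other   -> true (assumption)
meaningful?: func [data] [
    case [
        none? data   [false]
        logic? data  [data]
        number? data [not zero? data]
        series? data [not empty? data]
        true         [true]
    ]
]

print meaningful? 0
print meaningful? ""
print meaningful? "abc"
```

Handling logic! as its own branch sidesteps the naming awkwardness mentioned above: a logic value simply answers for itself.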