The ZERO? question

I have to say that, once again, it is ironic that the simplest concepts often turn out to be the most complex. The definition of ZERO? is a good example.

The original definition of ZERO? was designed to complement POSITIVE? and NEGATIVE?. It was designed for signed scalar values (e.g. integer, decimal, money), as in:

i: to-integer ask "Enter a number?"
if zero? i [...]

If you used ZERO? for strings or other non-scalar datatypes, it caused an error.

But then, after a few years, I found that I wanted to be able to test if any value was zero. After all, this condition works fine:

if val = 0 [...]

even when val is not scalar. If val is "hello", then we know that "hello" is not zero, so the above condition is false.

It seemed logical that ZERO? should work the same way:

if zero? val [...]

But in R2, if val is a string, ZERO? throws an error and your code blows up.

It seemed reasonable to allow ZERO? to accept any datatype and be the equivalent of the val = 0 comparison case. So, I allowed it.
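To make the two behaviors concrete, here is a minimal sketch. The helper names any-zero? and scalar-zero? are hypothetical and are not built-ins; the first simply wraps the val = 0 comparison described above, and the second shows the kind of guard you would otherwise need around a scalar-only ZERO?.

```rebol
REBOL [Title: "Permissive zero test (sketch)"]

; Hypothetical helper, not a built-in: wraps the val = 0 comparison,
; so a non-scalar value such as a string just compares as not equal.
any-zero?: func [val] [val = 0]

; Hypothetical guarded variant: only calls ZERO? when val is a number,
; so the scalar-only ZERO? is never handed a string.
scalar-zero?: func [val] [to-logic all [number? val zero? val]]

; Print each sample value next to the result of both tests.
foreach item [0 0.0 5 "hello"] [
    print [mold item any-zero? item scalar-zero? item]
]
```

Neither helper should raise an error for the string case, which is the whole point of the change.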

Good. Now I could check any value for zero, and I was happy for a while. ZERO? had become a dual test: it checked that the datatype was valid and, if the value was scalar, whether the value itself was zero.

As a result, I started using a lot more zeros in my data structures, especially those that had to be loaded from a file or database, because I knew that zero loaded more efficiently than NONE (no hash was required).

For example, where I would once create REBOL-stored DB records like this:

["name" none none]

I would use:

["name" 0 0]

I did this because 0 does not require a hash-and-compare operation (as the word NONE does), nor does it require binding (as the value NONE does). Nor is it the long-form literal #[none], which seems just a bit too cumbersome for my coding style.
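The difference shows up when a record is loaded rather than evaluated. The following is only a sketch: the variable names are made up for illustration, and it assumes a REBOL version that accepts the #[none] literal form.

```rebol
REBOL [Title: "Loaded NONE vs. loaded zero (sketch)"]

; LOAD does not evaluate, so the bare token none should arrive as a
; word that still has to be looked up before it yields a none value.
rec-word: load {["name" none none]}
print type? second rec-word

; The long-form literal is expected to load directly as a none value.
rec-lit: load {["name" #[none] #[none]]}
print type? second rec-lit

; A zero needs no symbol lookup or evaluation: it loads as an integer.
rec-zero: load {["name" 0 0]}
print type? second rec-zero
print zero? second rec-zero
```

Comparing the printed datatypes shows which form needs an extra step before the field can be tested.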

This may not be important for loading a few hundred records, but if you load a few thousand records, well, I just have to believe it makes a small difference. (And, certainly, feel free to prove me wrong.)

Also, I began thinking about "pseudo-scalars". For example, it seemed reasonable that 0:00 (a time) and 0.0.0 (a tuple) should be true for ZERO?. Right? OK, so let's allow that.

But, that leads down a longer road... doesn't it? Is a BITSET that contains only zeros true for ZERO? too? What about a BINARY, VECTOR, or even an IMAGE?

So now the definition of ZERO? begins to extend into the SERIES datatypes. The operation appears useful, but as I think about it more deeply, I also wonder how likely it is that we would actually make that kind of test for zero. If we never really do it, then we don't really need to support it.

And if we open ZERO? to be valid for all series, we also get the case where a string may in fact be TRUE for ZERO?: a string made up entirely of nulls.

So, at this point it's best to stop and re-examine the purpose and definition of ZERO?. What do we really need and what should it really do?

For example, if we find another way to make NONE more efficient for large data sets, does that change how we use 0 for such cases?
