Comments on: Vote: should UNSET act as a value?

As R3 moves toward finalization, we have some decisions to make.

VOTE NOW: Should the unset! value be ignored by certain control functions?

Note that unset! means "no value". For example, a variable that has not been assigned a value is unset, and a function that calls exit returns unset.

For some functions, such as the logical control functions any and all, this question is important. For example, if we have a function:

test: func [a] [print a exit]

and it is used here:

if any [test 123] [print "ok"]

The test function will return unset, but how should any handle it?

There are three choices:

1. The unset value is treated as a non-none value, so ok is printed.
2. The unset value is ignored, so it's like any [], which returns false, and ok is not printed.
3. Unset is not valid, and an error is thrown.
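
To make the three choices concrete, here is a rough sketch of them as a user-level function. It is only an illustration: any-policy and its policy words (value, ignore, error) are hypothetical names, not part of R3, and the real any is a native that need not work this way. The sketch assumes the R3 form of do/next, which takes a word to update with the new block position, and it assumes that under choice 1 the result is simply true.

```rebol
; any-policy is a hypothetical helper, not a built-in.
; policy is one of the words: value (choice 1), ignore (choice 2), error (choice 3).
any-policy: func [
    block [block!]
    policy [word!]
    /local result
][
    while [not tail? block] [
        ; Assumed R3 form of DO/NEXT: evaluate one expression and
        ; store the position after it back into the word BLOCK.
        set/any 'result do/next block 'block
        either unset? get/any 'result [
            switch policy [
                value  [return true]    ; choice 1: unset counts as a non-none value (returning true is an assumption)
                ignore []               ; choice 2: skip the unset and keep scanning
                error  [do make error! "unset! is not valid in ANY"]   ; choice 3
            ]
        ][
            ; Ordinary ANY rule: the first value that is neither none nor false wins.
            if :result [return :result]
        ]
    ]
    none   ; nothing qualified, same as any []
]

test: func [a] [print a exit]

if any-policy [test 123] 'value  [print "ok"]
if any-policy [test 123] 'ignore [print "ok"]
```

Under these assumptions, the first call should behave like choice 1 and the second like choice 2. Passing 'error instead should raise an error, as in choice 3.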

Please think about this relative to your experience and code you've written. What's the most practical result?

Vote for 1, 2, or 3 in the comment section.

27 Comments

Comments:

Brian Hawley
27-May-2009 17:09:28 I vote 2. Unset values should be like noops for ANY and ALL.
Oldes
27-May-2009 17:16:14 I vote for 2 as well. If something is unset (not existing), it cannot be ANY.
Henrik
27-May-2009 17:36:23 BrianH, are you sure 2 means a noop? What Carl writes is that the ignored value would be equal to ANY [], which means:
any [() 123] = any [] ; true
any [] = none ; true

I would not like that.
It seems more like 1 is a noop.
I vote 1, because that's how I would use it, and 2 would violate the purpose of the IF ANY construction, creating two points of failure instead of one, because you would then have to figure out whether 'test failed or 'any failed. With 1, you don't have that worry, but it could again be a feature of the IF ANY construction.
3 is a possibility, but I think it's too strict.
Brian Hawley
27-May-2009 17:41:50 Henrik, the test function in the example takes one argument, and in that call the argument is 123.
Brian Hawley
27-May-2009 17:43:11 And the test doesn't fail - it returns nothing, which is not the same thing as failure.
Maxim Olivier-Adlhoch
27-May-2009 18:01:56 Definitely #2
but it should be ignored in both 'ANY and 'ALL.
i.e. it doesn't trigger a match or a match-failure.
in R2: we have to explicitly ignore unset! which is annoying:
any [(print "ok" false) "matched"]
all [(print "ok" true) "matched"]

cause we'll never use the unset! as a return value anyways.
Henrik
27-May-2009 18:13:47 BrianH, I missed the argument bit, but the choices above make it impossible to tell the exact behavior of ANY with multiple values:
any [() 123] == ; unset!/123

By failure, I meant simply returning false or nothing.
I agree with Maxim about ignoring unsets and just carrying on, but what if the last value is unset for ALL?
all [123 ()]
Brian Hawley
27-May-2009 18:53:21 Henrik, in the first case:
>> any [() 123]
== 123
>> any [123 ()]
== 123

In the second case:
>> all [() 123]
== 123
>> all [123 ()]
== 123

The only failure of the example function was calling the function test when it doesn't return anything (or the REBOL equivalent, returning unset!). Unset values are not the same thing as none.
Paul
27-May-2009 23:55:26 I vote for option 3.
rebolek
28-May-2009 2:33:38 I vote for number 2 (and hope this post is long enough to get in ;)
Reisacher
28-May-2009 3:06:27 I vote for 2. Handle it like a noop.
DocKimbel
28-May-2009 4:51:34 I vote for #3.
#1: doesn't sound logical to me
#2: could be handy in some ANY/ALL expressions, but I'm afraid that it would lead to silent programming errors caused by implicit returned values. Moreover, if #2 was implemented, some typo errors wouldn't be detected anymore, like:
>> test: func [a] [print a exit]
>> if any [tst 123] [print "ok"] ; typo error in 'tst
ok ; result can't be trusted

So #3 (current R2 behaviour) looks like the least risky option, forcing the programmer to explicitly return a value. It's the most practical to me because it leads to the most robust code.
Andreas
28-May-2009 4:57:32 #3, same reasoning as Doc.
meijeru
28-May-2009 7:37:15 #3 for two reasons (probably the same ones :-)
- compatibility with R2
- safety
Robert
28-May-2009 7:51:56 #3, same reasoning as Doc. Tracking down implicit things like this is a horror in big code bases. Seeing code that explicitly handles this case makes it clear: it's intended.
Brian Hawley
28-May-2009 11:08:47 Doc, your typo example would generate an error on eval. You have to use get/any 'tst to have #[unset!] be generated. Good point though.
Part of the trick here is that you have to balance the benefits of error generation versus the overhead of error handling. We've been careful to make a lot of code that used to generate errors in R2 just work in R3. If there is a good rationale for "just working" that is. Sometimes the supposedly erroneous behavior is so common that the overhead of wrapping it in ATTEMPT is too much, as it was for the ordinals (FIRST, ...).
If the errors that would be generated are so important that they are worth complaining about, and only if, then let's generate the errors.
Anton Rolls
28-May-2009 13:04:58 Dockimbel makes a good argument, but I'm (somewhat cautiously) more with BrianH on this. I would need to think about it more to be sure.
Brian Hawley
28-May-2009 14:26:07 I'm actually OK with either option 2 or 3, but there are tradeoffs to either.
Which reminds me: Which built-in functions in R3 return unset! values? I can only think of PRINT and PRIN, off the top of my head. WRITE returns a value in R3. Even ASSERT returns true.
If there are no functions other than PRINT and PRIN that return #[unset!], it might be that the error case is more frequent in ANY and ALL. Back in January when option 2 was suggested, the first thing that came to mind was debug PRINT statements, but that was back before we got ASSERT. The () case is arguably an error, as it only shows up in test code where people want to generate an unset! value.
Now I would normally be for treating #[unset!] as a noop in control contexts, just like it is in DO. However, if #[unset!] is generally the result of a code error in ANY and ALL, then I gotta go with option 3. At least it's consistent with EITHER, IF, UNLESS and CASE, the other conditional control structures.
Cyphre
28-May-2009 16:19:08 I vote for 3 according to the rule of 'least surprise' - no 'automagic' results! In other words, yielding an error will force the developer to put a correct return value into the called function. It seems illogical to me to use a function with a return value of type unset! in ANY or other constructs. If you go with 1 or 2, it means to me that unset! doesn't make any sense other than to make things more complex, so it can be removed from the Rebol datatypes family. Also it seems to me that choosing 1 or 2 brings Rebol close to the 'dark side of Javascript' ;-)
BookSiberia
28-May-2009 22:51:03 Why not allow the default behavior to be a la Rebol 2, but allow a refinement to any (i.e., any/ignore-unset) to enable option 2, and perhaps even any/treat-as-no-none to allow for the first case.
pros and cons, anyone? Can this be done? SHOULD this be done?
BookSiberia
29-May-2009 1:34:02 Okay, quick follow-up here.
I probably need to rephrase (and get it right):
any/unset-invalid
any/unset-ignored (default case, so any with no refinement behaves similarly)
any/unset-valid
It seems to me that using refinements to any not only allows the programmer maximum flexibility, but is also self-documenting.
Brian Hawley
29-May-2009 12:40:02 Flexibility has a price in REBOL: Refinement processing has overhead. ALL and ANY are low-level, efficient functions. If we add refinements to them the processing overhead would add a significant percentage of additional time to each call.
Normand
29-May-2009 14:26:28 I prefer not to substitute an empty data value for a noop, and would prefer not to equate unset! with a non-none value, because involution of negation occults modality (possible versus necessary).
-- The 'none and the 'unset values (empty data values, possible versus necessary)
Compared to 'none, the word 'unset also has the semantics of a zero. In arithmetic, we distinguish the absorbing and the neutral value (0 times 3 = 0, 1 times 3 = 3). But arithmetic does not distinguish modality (that is, discern categorical quantification from modal quantification).
I do not use 'none and 'unset to express a no effect operation:
- 'none is used for a possible not yet set data value (to mean an optional value) where we are indifferent to the determination (or not) of such value, (example: having a pet in a persons database)
- 'unset is used for a not set yet but necessary value (example the person name and surname in a persons database)
On such a semantic, I would prefer
any [1 2 3 none!] yields 1 (first not false) and any [1 2 3 unset!] yields 1, as usual
but
all [1 2 3] yields 3, all [1 2 3 unset!] yields unset!, either [all [1 2 3 unset!]] [print "ok"] [print "not ok"] yields ok,
I would prefer that the condition evaluates to false (not ok), contrary to either [all [1 2 3 none!]] [print "ok"] [print "not ok"] that yields "ok".
-- Noop (the empty operational value)
Like 'noop, I use a word like 'gracefullend to mean the result of an operation normally returning no value, susceptible to mean an operation that has no effect on ulterior operations, indiscriminately repeatable.
But, for robustness, I prefer to distinguish a no-effect operational end of a silent operator from a no-effect end of a disruptive operator:
- 'gracefullend: a silent return, where the operation returns no value when it is designed to return no value, (like a return 'true in the context of an operation that is not a form of exception), ie. gracefullend: true
- 'ungracefullend: the contradiction of the preceding noop operator, to mean that the case of an ungraceful (insecure) ending of an operation was perpetrated and treated accordingly, ie. ungracefullend: false
A graceful operational end may bear a neutral effect:
any [1 2 3 gracefullend] yields 1, returns first non false. either any [1 2 3 gracefullend] [print "ok"] [print "not ok"] yields ok and it is an intuitive behaviour.
all [1 2 3 gracefullend] yields true, returning the last, the value of gracefullend. either [all [1 2 3 gracefullend]] [print "ok"] [print "not ok"] yields ok.
But an ungraceful operational end should (to me) bear an absorbing effect.
any [1 2 3 ungracefullend] yields 1, where any one regular end suffices.
all [1 2 3 ungracefullend] yields none at first false. But it is a state we want to treat as disruptive (necessary, ie. neither a normal speaking operator nor a normal silent operator). either [all [1 2 3 ungracefullend]] [print "ok"] [print "not ok"] actually also yields ok. It seems counterintuitive.
-- My understanding of the question
Question 1: none and not-none (unset!). For data, should we discriminate between optional and mandatory values?
Question 2: For operations that normally return no result (noop), should we distinguish noop and non-noop? It is illogical to equate them.
For any, with some other values, none, unset! and noop are interpreted as true, yielding 1, but alone unset! would yield false. But any [1 2 3 non-noop] should bear an error ('noeffect return of a disruptive operator).
For all, in data values none should be interpreted as vacuously true, unset! as necessarily false. It is a lack of necessary value.
For all, noop could be evaluated as is true, because an empty operational effect is simply redundant.
But non-noop should yield not-valid, and throw an error (not the normal noop).
onetom
30-May-2009 16:58:41 #3, because of the "least surprise" principle.
i was also pondering what would happen if we just removed the unset! datatype (as Cyphre suggests), but Normand's reasoning made me uncertain :/
however we can simulate the "necessary but not yet specified behaviour" quite easily in task specific ways, like notset!: #-1 notset?: func[v][v = notset!]
btw, why cannot print return the printed value or true or none or something? practically nothing else uses unset! apparently, right?
Endo
3-Jun-2009 11:18:23 #3 is good, because catching an error is easier than tracking the code to find the illegal values.
Revolucent
3-Jun-2009 13:55:53 I prefer option 2. I think in addition it should be possible to assign unset! to a word, unlike R2. This makes it possible to write cleaner code:
unless all [not unset? value: foo/bar value > 3] [ print value ]
If I try this in R2 and the return value of foo/bar is unset!, it fails. I deeply dislike that. Perhaps someone knows a better idiom.
Brian Hawley
3-Jun-2009 18:37:36 Revolucent, that failure is the whole point of unset!. It is supposed to not be a value, so assigning it is supposed to be an error in the default case. It's a placeholder for uninitialized-but-bound variables.
Here's an alternate to your code:
unless all [not unset? set/any 'value foo/bar value > 3] [print value]
The whole point of SET/any is to set unset! or error! values, which is normally supposed to be an error.
