Comments on: Scant Evaluation - Should it be Extended? [No]

(See below for updated information.)

REBOL 3.0 defines three methods of evaluation:

	Evaluation	Normal evaluation of all functions. This is just your regular REBOL code.
	Safe Evaluation	Evaluation based on a subset of functions that are secure. Safe evaluation makes it possible for blocks like `draw` to do more complex types of evaluation, but without the risk of security problems. This is a new mode of evaluation, and I will explain more about it later (not in this article).
	Scant Evaluation	A minimal form of evaluation used for headers and other data blocks that do not allow any level of deep evaluation.

This last method will be the focus of this article. You are familiar with scant evaluation when you load a file that contains a REBOL header and also when you use the construct function to import object data safely. Scant evaluation is really just another kind of REBOL dialect.

For example, using construct, this block is made into an object without any side effects:

obj: construct [
    author: "Smith"
    when: 10:20
    action: delete
]

The object's action variable is set to the word delete, and the delete function is not evaluated. This is a handy notation for storing data in external (serialized) objects because it eliminates the possibility of "malware" lurking in your objects.

However, what happens if the block above contains values that have no matching set-words? For example, what would happen with:

[
    author: "Smith" "Jones" "Davey"
    when: 10:20 past last edit point
    action: delete on request
]

Normally (in REBOL 2.0), each set-word would get set to only the first value that follows it. However, would it be useful to allow the scant evaluation dialect to collect all values that follow a set-word? When more than one value appears, should a block be implied?

Molding the above object would produce:

[
    author: ["Smith" "Jones" "Davey"]
    when: [10:20 past last edit point]
    action: [delete on request]
]

Of course, we'd provide a refinement for molding scant output, so your objects and headers can be serialized back to the scant format.

I think now you can see how scant evaluation could come in handy for REBOL headers and other types of objects. However, you know that I don't just throw "anything" into REBOL without a lot more deep thought and discussion regarding its merits. So, post a message and tell us what you think.

Note: this feature is already internal to the method REBOL uses for header evaluation; so, bringing it into the language as a formal method does not expand the REBOL executable by much. It's nearly free.

Additional Notes (Added 16-Aug-2006)

Many good points have been made in the comments section. Thank you for taking the time to post them. I agree with the general consensus that perhaps this form may go too far beyond normal REBOL, and as a result may cause difficulties for users. It is a little "out there", but I wanted to gather your opinions before dismissing the idea.

But also, from some of the comments, we should be sure we're talking about the same thing. To help clarify the proposal: Headers are already dialects and this form is mainly intended for explicit uses related to the secure instantiation of objects. In other words, a script header cannot be fully evaluated, because it is automatically evaluated when a script is loaded (to provide information to REBOL itself). So, it must be made secure. For example, in headers (and in the construct function) both word lookup and function evaluation are disabled, but set-word (assignment) is not.

Of course, it is also true that if you want such behavior as described above, it's easy to implement yourself. With that in mind, and from your comments, we will leave this concept to the implementation of the user, and not part of REBOL 3.0.
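
As a rough illustration of that user-level approach, the sketch below groups the values that follow each set-word before handing the block to construct. The name collect-scant and its single-value rule (one value stays bare, several become a block) are assumptions made for this example. They are not part of REBOL, and this is not how header evaluation is implemented internally. It is written against REBOL 2 semantics.

```rebol
REBOL [Title: "Extended scant evaluation (user-level sketch)"]

; collect-scant is a hypothetical helper, not a REBOL native.
; It walks SPEC without evaluating anything and returns a new block
; in which every set-word is followed by exactly one value:
;   - a single following value is kept as-is
;   - several following values are wrapped in a block
;   - a set-word with no following values gets an empty block
;     (an arbitrary choice made for this sketch)
collect-scant: func [spec [block!] /local out word vals] [
    out: copy []
    while [not tail? spec] [
        either set-word? first spec [
            word: first spec
            spec: next spec
            vals: copy []
            ; gather everything up to the next set-word or the end
            while [all [not tail? spec  not set-word? first spec]] [
                append/only vals first spec
                spec: next spec
            ]
            append out :word
            append/only out either 1 = length? vals [first vals] [vals]
        ][
            ; values that appear before the first set-word are skipped
            spec: next spec
        ]
    ]
    out
]

data: [
    author: "Smith" "Jones" "Davey"
    when: 10:20 past last edit point
    action: delete on request
]

; construct still does the secure part: no word lookup, no function calls
obj: construct collect-scant data
probe obj
```

Keeping the grouping step separate from construct means the caller opts in explicitly, which is the concern several of the comments below raise about doing it implicitly.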

12 Comments

Comments:

Anton Rolls
17-Aug-2006 16:40 At first glance it feels wrong, because afterwards you can't tell if it was originally a block or separate values were made into a block. But your example seems to make sense. What would molding with that refinement produce?
Ingo
16-Aug-2006 4:13 To me, making it into a block makes more sense. You can always mold the block yourself, but going back to a block from the string may really lose information. Maybe a refinement to use one or the other? It may be that different apps would be better served by different ways.
Another question, what happens with embedded commas? Like
[ author: "Smith", "Jones", "Davey" when: 10:20 past last edit point action: delete on request ]
** Syntax Error: Invalid word -- , ** Near: (line :) author: "Smith", "Jones", "Davey"
I've always felt that REBOL's unforgiving handling of commas is a major shortcoming, given how often they are used in the rest of the world.
Ingo
16-Aug-2006 4:14 ( ... this smiley should have been an eight, followed by a closing paren ...)
:)
Ladislav Mecir
16-Aug-2006 6:44 I don't want the REBOL interpreter to know better than me what I wanted to write. I suggest picking the simplest possible alternative, i.e. ignore the following words - if I want to get a block I can use a block.
Henrik
16-Aug-2006 9:06 I don't know if I like it, but couldn't it be done with a refinement or a separate function? It seems very handy to me, but I think it would be worth investigating other patterns of block elements, as it wouldn't be obvious that you can do this. You might also do it accidentally and not know what is happening.
You could have a COLLECT function to do this before using CONSTRUCT so you can specifically say, you want to construct the object this way.
Cyphre
16-Aug-2006 9:58 I agree with Ladislav. I don't see any problem with using a block in such cases.
Jaime
16-Aug-2006 10:59 I echo Ladislav's position, because it introduces another form to express structure, and one that could easily break programmer assumptions.
The construct could return either a value or a block of values and the programmer then needs to validate each slot for a particular type.
Maxim Olivier-Adlhoch
16-Aug-2006 12:16 I also think that using a block around values is very easy to add manually.
And like Jaime, I think consistency in REBOL is paramount to its beauty. This "shortcut" will inevitably cause headaches, and assumptions will be made about its use in contexts where it's not applicable.
But I can see where this could make constructing datasets programmatically much easier to handle, alleviating the need for sub-blocks to be created, handled separately, then "inserted" in the master block.
So I second Henrik's proposition that a different native be added ('COLLECT seems appropriate) which explicitly does extended scant evaluation, and maybe a mold/collection refinement to complement it.
Carl Sassenrath
16-Aug-2006 16:29 I will post some additional notes to help clarify.
Carl Sassenrath
16-Aug-2006 16:49 Refresh your view of the article to see some additional notes based on your comments.
Now, with regard to commas... ah, now that's an interesting statement above. Perhaps it's worth an article on its own, eh?
Maxim Olivier-Adlhoch
25-Aug-2006 16:17:26 Commas could simplify matrix expressions:
[1 2 3, 4 5 6, 7 8 9]
instead of:
[[1 2 3][4 5 6][7 8 9]]

Maybe forcing length symmetry between all sub-blocks as part of a new matrix! datatype.
make matrix! [0 : "X" 1 2 3, "Y" 4 5 6 , "Z" 7]

"0 :" being the default value for unfilled data.
(the last row would return ["Z" 7 0 0])
Carl Sassenrath
31-Aug-2006 15:11:13 Back in 1997, when designing REBOL, I considered commas for many months. It was a difficult decision to exclude them, but I think it was the correct design decision.
Here are some important questions to consider:
1. Is the comma "syntax" or "semantic"? (That is, can you operate on it from within REBOL code itself?)
2. Is comma a word?
3. What happens to non-british decimal values if comma is allowed? E.g. currently 1, = 1. and 1,2 = 1.2
4. How is the comma stored in the block?
5. Is it an element of the block? If I write FOURTH on your above block, what do I get back?
6. If I sort or otherwise modify the block, what happens to the commas?
Years ago, my conclusion was that the concept of comma as a separator was inconsistent with the design of REBOL. This is fundamental in REBOL's design. REBOL contains no separators (only expressions).
It turns out where I personally miss comma the most is as a functional expression terminator (e.g. like a semicolon in most languages). It would allow functions to have optional arguments:
list-dir, help, print "done"

But in such cases, the comma is simply a word. It is semantic but with a special lexical precedence to decouple it from prior characters (like a delimiter).
