REBOL 3.0

Comments on: Self Reflection

Carl Sassenrath, CTO
REBOL Technologies
26-Apr-2010 3:20 GMT

Article #0312

Within a context, the word self refers to that context.

For example, within an object:

obj: object [a: 10 f: does [probe words-of self]]
probe obj/f
[a f]

The self reflection shows words a and f. But you may ask: where is the word 'self?

Another example is a loop context:

repeat n 1 [probe words-of self]
[n]

where the same situation occurs.

You should know that self is a special word with an automatic binding to the context of its reference. It's indeed one of the few keywords of the REBOL language, and because self is implicit to each context, its value is not reflected in a mold of the context (as you can see above).

Now, we must ask how flexible we should allow self usage to be. Should this code be allowed:

object [self: 10]

Wouldn't such usage be nonsensical? It seems so. We cannot set our context in this way, so the expression is meaningless. However, if we think of self as just another word, then the expression should be valid, shouldn't it?

In fact, one might argue that if we disallow self as a variable then we are creating a singular irregularity in the "name-space to value-space continuum". Mathematically, that sounds bad, like the universe may implode, but practically speaking, who cares? Most languages are not built solely on mathematical purity, nor are purely mathematical languages all that practical in real life programming.

At the core of the issue is the fact that self must be bound to the context, but we don't want to pay that price for every context, because 99% of contexts don't refer to self at all. In R3, we solve this problem by allowing self to be a special word that has no value storage. This is an optimization where we get self almost for free.

Now... with that said, in the current version of R3, you can write:

repeat self 10 [print self]

This proves to be an interesting situation because the context consists of two identical words: self the context and self the loop variable. It's questionable if this should be allowed, but on the other hand, is such overloading really a problem? If we don't want to allow it, then would the above expression throw an error to indicate that self cannot be used in this way?

49 Comments

Maxim Olivier-Adlhoch
27-Apr-2010 0:41:44
I've used 'SELF as a variable when building class based OO mechanisms in R2. I don't know how this affects your decision, but I guess an example may help you choose.

ex:

my-class: context [
    name: none
    class: context [
        display: func [self][print self/name]
    ]
    ;stubs
    display: does [class/display self]
]
user: make my-class [name: "max"]
user/display

for such a small object, it seems pointless, but because the class methods within aren't copied and bound for each instance, when objects and apis become largish, this can lead to 100 times less memory and over 200 times better performance when building larger datasets.

Stubs are very small and require very little binding. Rarely more than a few words each. The class methods on the other hand, which actually have code in them, might have hundreds of words each. when you've got upwards of 15-20 functions in such an object, I've measured that allocating a single object can hit 20kb of RAM, this adds up... REAL FAST.

With the class/stub example above, we can hardly measure the impact of the object.

Although the stubs do provide a small performance hit, in many tests I've done this becomes minute when compared to the GC and binding which ISN'T done.

people will say this isn't rebolish... well, I'd say it's VERY rebolish... with just a few lines of code, I've implemented class based OOP myself, without added clutter. How many languages make it as easy to change such a fundamental semantic when it's advantageous to your problem?

I'd hate to lose this kind of natural symmetry in the language.

just my two cents.

Brian Hawley
27-Apr-2010 1:34:23
Maxim, it seems to me that you could implement your class-based oop using the word this or something, and it would have the advantage of allowing you to refer to both the instance variable (this) and the class context (self) in the class functions. So using self for that purpose is the worse choice. Other than that, Maxim, that is a nice trick, one that the R3 GUI does as well (without using the word self).

Now as for whether self should be treated specially: For object, module and script contexts it makes sense to think of self as a predefined word with certain restrictions. The current restrictions could probably use a bit of rebalancing though - there's a half-dozen tickets to that effect.

I think there would be fewer complaints if there were some way to override the self implicit context variable with a self field, at least for the contexts associated with function types and loop functions. Not fewer people complaining: Only a few (exceedingly smart) people have complained about this at all, but they made a lot of complaints, a flame war even. R3 can do this now, if awkwardly and with a few fixable errors. Should it?

However, whether or not we get rid of the self override, we need to not do the binding trick for the self context variable in the code block of loop functions (the only current R3 functions that create an implicit context are all loop functions). There is no use for self to refer to those contexts within the code block, at least no use that can't be solved with bind?, and rebinding self makes these functions less useful inside objects, modules and scripts. We know that this binding trick can be turned off because it was for the bodies of closures - we just need to turn it off for loop functions as well.

There is also the question of whether bind and in should continue to bind the self context variable by default when explicitly called, or whether it should be made a bind/self option. We know that if it is an option, that option would be used by certain significant low-level mezzanine code, so the behavior at least needs to be made available. But some people have said that the current binding behavior is too advanced to be the default, so it should be optional.

-pekr-
27-Apr-2010 1:37:55
I don't understand what the question is :-) I also don't understand Max's worry - no one is suggesting we are going to lose the 'self word, no?

If you ask about overloading - 'self should not be treated like a special word, even if it is implicit and hidden. If some user does:

print: none

... then he just lost the 'print function. So if someone overloads 'self, he loses 'self. But, otoh, I would not mind 'self being protected, or shielded by an error stating, e.g., that 'self is a special keyword.

It all depends on what difficult-to-track situations allowing 'self overloading might create for the user. But then, every reboller has surely been caught by redefining a regular function, etc., so I am not sure I worry that much, unless it has consequences I am not able to foresee ...

ridcully
27-Apr-2010 2:01:10
I'm only following REBOL theoretically, but maybe it would be a possibility to have a function named self or currentContext that would always return the current context? This way you wouldn't have to introduce a (singular) keyword to REBOL.

Brian Hawley
27-Apr-2010 2:45:48
Ridcully, you bring up an interesting theoretical point: No, that would not be possible, because there is no such thing as the "current context" in REBOL. Lexical scoping is faked, as is dynamic scoping. Blocks of code don't have any context at all, though they might contain words that each have their own contexts. REBOL "contexts" aren't really contextual (sorry).

REBOL has direct binding of word values to context values. Binding is an action, performed at runtime during the definition phase of objects and functions, at start time for some loop functions, or explicitly with the bind or in functions. "Nested" bindings are just overridden by subsequent rounds of the binding process. So if you have an object with an "inner" object that has a field of the same name, references to that field are just bound twice, the second binding overriding the first.
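A minimal sketch of this double binding, assuming make object! behaves as it does in R2 and in current R3 builds. The names outer, inner and get-a are made up for illustration. Every a in the inner spec is first bound to outer (when the outer spec is bound) and then rebound to inner (when the inner object is made):

```rebol
outer: make object! [
    a: 1
    inner: make object! [
        a: 2                 ; bound to outer first, then rebound to inner
        get-a: does [a]      ; this a ends up bound to inner
    ]
    get-a: does [a]          ; this a stays bound to outer
]
print outer/get-a            ; prints 1
print outer/inner/get-a      ; prints 2
```

Nothing is looked up through a scope chain at run time; each word simply carries whichever binding was applied to it last.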

The binding (noun) of the self word is figured out at binding (verb) time, and after that just read from a pointer reference. It can't be figured out from "context" at runtime, because there is no context beyond a single pointer inside each word value.

I hope that makes sense to you, because it's a hard subject that others have done a much better job of explaining. Look at Ladislav's Bindology articles for more info.
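The claim that a block has no context of its own, only words that each carry their own binding, can be sketched as follows. This assumes in and reduce behave as in R2 and R3; the names a, b and blk are illustrative only:

```rebol
a: make object! [x: 1]
b: make object! [x: 2]

; Build a block of two words that are spelled the same
; but are bound to different contexts.
blk: reduce [in a 'x  in b 'x]

probe blk    ; shows [x x]
print blk    ; prints 1 2
```

The block looks like [x x], yet evaluating it yields two different values, because the binding lives in each word and not in the block.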

Ladislav
27-Apr-2010 3:15:03
Max, your example contains an error (the my-class/class/display function cannot work), as far as I can tell.

Ladislav
27-Apr-2010 3:26:46
As far as I am concerned, I do find it acceptable to have a 'self keyword in the object specification dialect.

On the other hand, I do not find the 'self keyword acceptable in the function specification dialect (bug #1528), nor do I find it acceptable in cycle blocks (they can be seen as dialects too - see bug #1529).

These problems look "circumventable" by using a "'self binding exception", as BrianH suggested, but that leads to a new problem that I would dare to call "BIND crippling" (see bug #1549).

To have a clean way to support the above mentioned dialects without the 'self keyword (I see these dialects as "the majority"), it looks like we need contexts that do not handle 'self as a keyword.

Ladislav
27-Apr-2010 3:36:37
When comparing the seriousness of the above mentioned bugs, I see the bug #1549 as the most serious one.

That means that I would rather accept 'self being a keyword of the function spec dialect and of the cycle control functions than invent any "work-arounds" crippling BIND.

Gabriele
27-Apr-2010 5:42:15
I personally only see two "sane" ways to handle this:
  1. BIND always binds 'self to the target context. (Maybe, with the exception of function!, because it is a special kind of context.) This will surprise users (e.g. 'self in a closure in an object does not refer to the object), but you can reduce surprises with good documentation.
  2. BIND only binds 'self if the context is an object! context and in no other cases. (We need to get the ability to make non-object contexts easily for mezz loop functions and so on, maybe closure! is enough, though I'd rather allow make context!.) That is, object! and self are the exception; this is probably less surprising for users.

I agree with Ladislav that having refinements in BIND to solve the "surprises" of (1) is a bad idea.

ridcully
27-Apr-2010 7:38:42
Brian, thanks for your explanation. I already supposed that I wouldn't be the first to come up with that idea if it had been possible in REBOL.

Brian Hawley
27-Apr-2010 13:37:52
Gabriele, there's a problem with this: "BIND only binds 'self if the context is an object! context and in no other cases."

Actually, we need the self context field for module and script contexts too; self is used in places where system/words would be used in R2 - it's the best way to get a reference to the current script or module context.

"That is, object! and self are the exception" - No, since all object, module and script code blocks would need to support it, it would be about 50/50. But not being able to bind self when you expect to be able to would count as a shocking situation.

I think the best way to resolve the "surprise" factor would be to make the binding behavior consistent based on the category of activity. The basic premise would be that self is not a field, but a keyword, and then limit where that keyword is used to very predictable circumstances.

  1. When you are defining an object, module or script, bind self to the context. We need the keyword in these. The self binding would act like a protected field to the code there, but wouldn't be a real field at all - it would be a well-documented keyword.
  2. When you are defining a function, closure, any function type, don't bind the self keyword, or prevent them from using 'self as a parameter. Principle of least surprise.
  3. When you are running some function that creates an implicit context at runtime (i.e. some loop functions), don't bind the self keyword, or prevent them from using 'self as a variable. To do otherwise would confuse people, and limit access to the self bindings for objects, modules and scripts (1).
  4. When explicitly binding words or blocks of words with bind or in, don't treat self as a keyword. If there is a 'self field you can bind that. The source of the context you are binding to is irrelevant: No surprise binding of self allowed.
  5. We would need to have a bind/self option to enable the self keyword for explicit binding, because the binding of scripts and modules done in (1) is done by mezzanine code. An error should be triggered by bind/self if the context you are binding to has a 'self field - this will never happen in the mezzanine code. No surprise binding of self to a 'self field with bind/self - it needs to be bound to the context every time, predictably. Code depends on this.
  6. Leave in the append workaround for 'self fields, and allow conversion from maps with 'self fields to create objects with 'self fields. Count on the error handling of (5) to keep you safe.
  7. Perhaps the construct function should have the self keyword by default, along with its other header-friendly tricks. If so, the construct/only option (that we already have) would turn off the self keyword like it does the other tricks, and allow 'self fields.

This would have the effect of removing R2's 'self field altogether, and just making self a keyword of object, module and script blocks, or explicitly applied. All other binding situations would not use the keyword. There would not be an implicit self field for objects anymore - it would be something that is done, not something that is there. And no surprises.

Brian Hawley
27-Apr-2010 14:41:34
According to the proposal in the previous message, these functions would use bind/self (bind with the self keyword): intern, system/intrinsic/make-module and make object! (calling the internal equivalent). These functions call intern: load (but not load/unbound), system/intrinsic/do and system/intrinsic/begin (for the --do argument). This means that almost all code loaded into R3 would have bind/self applied to it at least once.

Everything else that needs to do binding (including functions, loops) would just call bind without the self keyword (or the internal equivalent). This means that most code (including Ladislav's do-in) won't apply bind/self again, so the code will still have the existing self keyword binding left over from when it was loaded into R3.

This seems like the best balance to me.

Carl Sassenrath
27-Apr-2010 14:42:46

Thank you for rapidly posting your comments. I'll read them over and propose some kind of solution for A98.

Yes, it is possible to remove the implicit self from loop contexts; however, there will always be cases of nested contexts in code where references to outer contexts require an explicit variable defined in those contexts (e.g. myself: self). And... users need to know how to do that; it's quite simple, no different than having names for objects or functions. It's a standard method of reference.
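A small sketch of that method, assuming make object! binds self to the object being made (as it does for objects in both R2 and R3). The names outer, inner and show are illustrative only; myself is the explicit variable mentioned above:

```rebol
outer: make object! [
    name: "outer"
    myself: self             ; explicit name for the outer context
    inner: make object! [
        name: "inner"
        ; self here refers to inner; myself still reaches outer
        show: does [print [myself/name self/name]]
    ]
]
outer/inner/show             ; prints outer inner
```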

Ladislav
27-Apr-2010 15:49:03
Please, don't keep the bug #1549. If you decide to keep 'self as keyword for all contexts, then, please, don't remove 'self from loops or closures.

Brian Hawley
27-Apr-2010 17:26:42
Ladislav, we were both wrong about #1549. The self keyword isn't a feature of contexts, it is a feature of bind. R3's contexts don't have a self field, not really, whether they are objects or not. The 'self field has already been removed from all contexts, including objects. The self keyword is not a field, it is a (potentially optional) bind artifact.

It's not like R2, where object contexts had a 'self field, loop contexts didn't, and all object code assumed that the first field was 'self, which broke for loop contexts. As your do-in code demonstrates.

Carl Sassenrath
27-Apr-2010 18:30:40
Ok, first, here's where we'll document self:

www.rebol.com/r3/docs/concepts/objects-self.html

If something is missing, let me know or edit it yourself.

Second, the #1549 description is not clear to me.

Brian Hawley
27-Apr-2010 22:59:25
Some quick notes:

"when used within the standard context of your code, it refers to your global context." - This means that when your code is in what you might think of as being a "script", rather than in a "module", self refers to system/contexts/user. You might want to mention scripts when you mention modules, because most make the distinction between the two and don't see how similar they are in R3.

It should be sufficient to change "Within an object! or module! the word self is pre-defined" to "Within an object!, module! or script the word self is pre-defined", and change the heading "Modules" to "Modules and Scripts". Maybe a few tweaks to the text of the section to mention scripts will help too. The point is to make it clear that you can't escape self just by not writing modules or objects.

While you're at it, change

probe words-of self
[system words-of]

to

probe words-of self
[system probe words-of]

Carl Sassenrath
27-Apr-2010 23:25:18
Here's the current validation test that will need to pass with all trues -- do you agree? What else should we add?

www.rebol.com/r3/self-test.r

Brian Hawley
27-Apr-2010 23:51:44
(For the purposes of this discussion, I have been using self to refer to the keyword bound to the context, that doesn't actually exist in the context as a field. To refer to an actual self field I have been using 'self. Pardon the anthropomorphism of code herein.)

In the "Non-Object Contexts" section you don't make it clear whether self is omitted from the contexts, or simply not bound to the original code blocks that they were created to serve. This matters when the life of the contexts extends beyond their initial use.

The question on Ladislav's mind (as expressed in #1549) is how those non-object contexts will behave beyond their initial use. In particular, in #1549 Ladislav is requesting that bind do the self keyword trick for contexts that originally were object-like, and that it not do the self keyword trick for contexts that originally were created for functions, closures, or by loops. It is assumed (I hope - he doesn't say so) that there will be some way to tell the difference between the two at runtime, and to express the distinction in mold, load and do syntax.

The basic premise of #1549 is that the creator of the context should be the one who decides whether self is bound later on, to code beyond the initial code that is associated with the context creation. It presupposes that objects (and modules and scripts) have a 'self field that refers to their context, and that contexts created for other purposes don't have that self field (sorry, we didn't know at the time that objects don't have a 'self field either in R3). Basically, he requests a return to R2's object model, or behavior that acts like it.

My counter-argument is that the creator of the context is not the one using the context later, and certainly not the one providing the code that the later binding will affect. That code will be written to depend on whether self will be bound, so the behavior of bind will need to be predictable. If that behavior is different depending on the original source of the context, then all attempts to bind code that depends on self being bound or not will need to be wrapped in conditional statements that check the context for that feature, and either trigger an error or convert the context to the other type. Or it will just be buggy, as R2's behavior with non-object contexts has been with most functions that expect object contexts.

Beyond that, the rest of the #1549 argument is Ladislav and I making argument and counter argument, and many associated tickets.

My counterproposal is #1544, which would make applying the self keyword an option at the time of application, rather than at the time of context creation. This would allow people who are writing the code to determine how their code is bound, rather than the creator of the context, who hasn't even seen your code and has no idea how you want to use self or 'self. Basically, I request that R3's current object model be cleaned up a bit, but otherwise left as is.

Brian Hawley
28-Apr-2010 0:30:34
Additions to the tests posted above...

Bug #1537:

--- "Type of self is object!"
ob: object []
print object! = do bind [type? self] ob
print object! = do bind [type? :self] ob
print object! = do bind [type? get 'self] ob
print object! = do bind [type? get /self] ob
print object! = do bind [type? get quote self:] ob

Will bug #1528 be fixed for closures? If so, add this to the "In functions, self can be an arg:" section:

print object? attempt [ob: object [f: closure [self][self]]]
print 1 = attempt [ob/f 1]

On the other hand, this line in the tests, getting rid of the append override of self with 'self, suggests that object contexts are not going to be able to have a 'self field, so perhaps closures won't be able to have 'self as a parameter:

print prot? [append object [] [self 1]]

You also need to check the indefinite-extent behavior of self binding for contexts created for function!, closure! and loop functions.

And please check out #1544 (or #1543, though I prefer the other) before you set everything in stone.

Brian Hawley
28-Apr-2010 1:49:12
Here is the difference between the proposals in test form.

Ticket #1544 (BIND /self option) - my favorite:

--- "Binding self explicitly later"
ob: object []
print not same? ob do bind [self] ob
print not same? ob do in ob [self]
print same? ob do bind/self [self] ob
ob: object [f: func [/x] [do bind/copy [self] 'x]]
print same? ob ob/f
ob: object [f: func [/x] [do bind/copy/self [self] 'x]]
print error? try [ob/f]
ob: object [w: repeat x 1 [bind? 'x]]
print 1 = ob/w/x
print not same? ob/w do bind [self] ob/w
print not same? ob/w do in ob/w [self]
print same? ob/w do bind/self [self] ob/w
ob: object [w: repeat self 1 [bind? 'self]]
print 1 = attempt [ob/w/self]
print not same? ob/w do bind [self] ob/w
print not same? ob/w do in ob/w [self]
print error? try [do bind/self [self] ob/w]

Ticket #1543 (BIND /no-self option):

--- "Binding self explicitly later"
ob: object []
print same? ob do bind [self] ob
print same? ob do in ob [self]
print not same? ob do bind/no-self [self] ob
ob: object [f: func [/x] [do bind/copy [self] 'x]]
print error? try [ob/f]
ob: object [f: func [/x] [do bind/copy/no-self [self] 'x]]
print same? ob ob/f
ob: object [w: repeat x 1 [bind? 'x]]
print 1 = ob/w/x
print same? ob/w do bind [self] ob/w
print same? ob/w do in ob/w [self]
print not same? ob/w do bind/no-self [self] ob/w
ob: object [w: repeat self 1 [bind? 'self]]
print 1 = attempt [ob/w/self]
print error? try [do bind [self] ob/w]
print error? try [do in ob/w [self]]
print not same? ob/w do bind/no-self [self] ob/w

Ticket #1549 (Ladislav's proposal):

--- "Binding self explicitly later"
ob: object []
print same? ob do bind [self] ob
print same? ob do in ob [self]
ob: object [f: func [/x] [do bind/copy [self] 'x]]
print same? ob ob/f
ob: object [w: repeat x 1 [bind? 'x]]
print 1 = ob/w/x
print not same? ob/w do bind [self] ob/w
print not same? ob/w do in ob/w [self]
ob: object [w: repeat self 1 [bind? 'self]]
print 1 = attempt [ob/w/self]
print not same? ob/w do bind [self] ob/w
print not same? ob/w do in ob/w [self]

Note that with Ladislav's proposal, you can't tell how bind will treat [self] without knowing where the context came from. This will lead to programmer errors, which aren't triggered with the #1549 proposal the way they are with the others. On the other hand, it will lead to less surprise for experienced R2 programmers (at the expense of everyone else).

Ladislav
28-Apr-2010 2:17:48
bind tests (aka bug #1549 tests):

--- "BIND works 'as expected' in object spec"
b1: [self]
ob: make object! [
    b2: [self]
    print same? first b2 first bind/copy b1 'b2
]

--- "BIND works 'as expected' in function body"
b1: [self]
f: func [/local b2] [
    b2: [self]
    print same? first b2 first bind/copy b1 'b2
]
f

--- "BIND works 'as expected' in closure body"
b1: [self]
f: closure [/local b2] [
    b2: [self]
    print same? first b2 first bind/copy b1 'b2
]
f

--- "BIND works 'as expected' in REPEAT body"
b1: [self]
repeat i 1 [
    b2: [self]
    print same? first b2 first bind/copy b1 'i
]

Ladislav
28-Apr-2010 2:39:50
Disclaimer: please disregard any notes above starting with the words "in Ladislav's mind", "Ladislav requests", etc. Thanks.

Brian Hawley
28-Apr-2010 4:02:59
Whoops, didn't finish simplifying the tests, at least as far as the loop tests are concerned. Replace the #1544 loop tests with these:

ob: repeat x 1 [bind? 'x]
print 1 = ob/x
print not same? ob do bind [self] ob
print not same? ob do in ob [self]
print same? ob do bind/self [self] ob
ob: repeat self 1 [bind? 'self]
print 1 = attempt [ob/self]
print not same? ob do bind [self] ob
print not same? ob do in ob [self]
print error? try [do bind/self [self] ob]

The rest can be simplified the same way.

Gabriele
28-Apr-2010 4:53:31
Brian, I don't see why you make it that complicated. I'll revise my two options this way, if you want:
  1. Always BIND 'self, warn the users about the surprises
  2. BIND 'self depending on context type; yes for object!, module!, script! (is it separate from module! ?), no for context!, closure!, function!.

Still no refinements for BIND needed, nor other special cases (I picture module! as being almost a "sub-type" of object! so there actually are only two types, the parent generic context! with no 'self, and the more specialized object! with 'self).

I guess the simplest way is (1), if you're willing to tell users that 'self is not the enclosing object but rather the "closest" enclosing context.

Brian Hawley
28-Apr-2010 5:47:18
Ladislav, thanks for posting those tests. They are semantically compatible with the tests and notes I posted earlier here for your #1549 proposal (including the ones you said to ignore), but better formatted. The closure clarification especially helps. And thanks for putting 'as expected' in quotes, since that really does depend on what you'd expect.

Sorry for the phrasing; it was a matter of timing - you were asleep, and someone familiar with your arguments needed to express them to Carl before he solidified things based on a misunderstanding. And you had explained your thoughts to me earlier, at length. Based on your tests, it appears that I did a good job explaining your proposal.

The reason I put the binding tests outside of the code block (even for the #1549 tests) was so that people wouldn't get confused about what was happening. If you put the explicit binding test inside the original code block of the originator of the context, it would make people think that the rules that apply to the original code block would necessarily apply to an explicit binding that coincidentally is inside the code block. That is similar to expecting a function to rebind self when its definition expression is inside an object spec, just because it's inside an object spec.

By moving the explicit binding statements outside of the original code block, it makes clear that these are separate processes. Even within the code block they are still separate processes, but that's not as obvious to the casual observer.

I had to put the explicit binding to the function context within the function code block though because the context is invalid after the function returns. This is not the case with closures.

I said necessarily above, because it is really a choice either way. There are valid arguments for both models.

Could you provide your ideas for how you would expect to create these selfless contexts with make syntax? That would be helpful here at this point.

Andreas
28-Apr-2010 6:21:06
"Could you provide your ideas for how you would expect to create these selfless contexts with make syntax? That would be helpful here at this point."

make context! []

Also, have the convenience wrapper context create a (selfless) context! and object a (selfish) object!.

Brian Hawley
28-Apr-2010 8:27:50
Gabriele, I wasn't making it more complex, I was just using more words to explain when the first explanation wasn't being understood. Or in the case of the tests, switching to a more precise language. It's a bad habit, sorry, one which I'm going to have to continue in this post.

Your description of the context situation is really close. Scripts are bound to object! contexts, and modules contain an inner reference to a pair of objects for their context and spec, so you were right on the money there. All types in the any-object! typeset have underlying object! contexts.

Functions have their own internal context type (frame!, afaict) that is stack-relative, task-safe (in theory), and invalid after the function returns. This means that bind? returns true for function contexts, not a context reference that would become invalid later. You can't get a direct reference to a function context in R3.

Closures currently generate object! contexts, but don't bind self in their code blocks - what the generated contexts do with self later is none of the closure's concern.

In theory you could add a new context! type that would act like an object! context, but bind would not do the self keyword trick with them, and 'self fields would be allowed. Loops could use the context! type, and maybe closures too if you insist on it. The context! type would not be able to be a part of any-object! though, because this makes it too hard to check for self support in screening code. And we would have to define compatibility between context! and the any-object! types in the situations where that may arise.

This third context type would satisfy your (2) proposal, but this is starting to get really complex. You are trading a native option for interpreted conditional code whenever you call bind. Your solution is actually more complex than mine, but you have moved the complexity out to the user code, where managing it is slower and more bug-prone.

Your (1) proposal is the simplest, but probably won't be accepted. There are already many tickets that make this clear: There are some situations where it is appropriate to bind self, and others where it is not. You can't really escape that.

Somewhere in between the simplicity of not offering a choice, and the complexity of working with context!, there is bind/self. Basically, in user code you would be replacing code of equivalent complexity to this:

either any-word? c [
    assert [true = bind? c]
] [
    assert/type [c context!]
]
bind [code] c

with this:

bind [code] c

And replacing this:

either any-word? c [
    bind [code] bind? c
    ; fails for function contexts, intentionally
] [
    assert/type [c any-object!]
    bind [code] c
]

with this:

bind/self [code] c

A little less complexity, and all of it hidden. All the errors triggered by the code above are done intentionally; when your code depends on self being bound, or not bound, it would be an error to bind the code otherwise, so you have to screen for it if that is possible.

For what it's worth, R3's bind has already had options added to support new capabilities - every one of these options has sped up R3 quite a bit (particularly with making and binding modules). I really miss R3's bind when using R2.

Ladislav
28-Apr-2010 10:32:36
BrianH wrote referring to my above proposed tests: "They are semantically compatible with the tests and notes I posted earlier here" - there is one important semantic difference: my tests do not "prescribe" any definite behaviour of the

do [self]

code in any particular "context". The only property my tests verify is the ability of bind to handle blocks as needed.

Moreover, there is yet another important property illustrated in the tests: namely the fact that bind may be needed during evaluation of the examined dialect code (object spec, func body, cycle body, ...), i.e. not "ex post". BrianH explicitly criticized the "ex post" case, using it as a strawman against my proposal.

Carl Sassenrath
28-Apr-2010 12:06:43
Thanks to all for your input and suggestions. Here's where we now stand:

  1. A first draft implementation is running, and I will release it very soon for your testing purposes.
  2. The terminology selfless and selfish is worthwhile. Objects and modules are selfish (contain an implicit 'self that cannot be modified.) Functions, closures, and loop contexts are selfless.
  3. Bind does not need a /self refinement because we know if a context is selfish or not.
  4. My tests and the tests posted above pass (minus one bug in my test, and after removing the bind/self related tests.)

Carl Sassenrath
28-Apr-2010 12:45:21
There is one small "glitch" that I noticed from the above tests:

print self = :self
false

That's because the word 'self is evaluated to be an object. Under the hood, it's actually a frame!

Surprisingly, fixing this at a fundamental level would be quite complicated and would either penalize runtime performance or memory storage. I'm not sure we want to do that. I'm leaving it as-is for now.

Brian Hawley
28-Apr-2010 15:10:21
"That's because the word 'self is evaluated to be an object. Under the hood, it's actually a frame!"

I'll dismiss bug#1537 then. It's no big deal if you don't need to support self and 'self with the same code. Feel free to not add that whole group of type tests.

Does this mean that the word/value mapping structure underlying objects and such is called a frame! in R3? If so, we should be using that term instead of "context", to avoid the confusion that arises when people realize that REBOL "contexts" aren't exactly contextual. (It happens all the time.)

"Bind does not need a /self refinement because we know if a context is selfish or not."

Bind knows whether a context is selfish. We don't know until after runtime testing to determine the context type, or data flow analysis to tell us where the context comes from. And that difference affects how the code block passed to bind needs to be written if it contains 'self references.

The key feature of the bind/self proposal is the errors triggered in the #1544 tests above. Those errors allow us to track down the behavioral bugs in our code caused by the difference between selfless and selfish contexts. And they require fewer (visible) runtime tests than the alternative, as shown in my last message.

Nonetheless, if you provide some equivalent to a selfless? function then the distinction wouldn't affect user code as much as the object/map distinction, so it won't be that bad. And it would be more compatible with R2 code, but better because it would be easier to test for selfless contexts than it is in R2 (or especially in R3 <= a97).

Brian Hawley
28-Apr-2010 15:47:55
Ladislav, it wasn't a strawman argument. After a function or closure is defined, or a loop starts, or the code of an object or module starts running, any explicit calls to bind are "ex post". This is true whether those calls are made inside the initial code block or not.

"The only property my tests verify is the ability of bind to handle blocks as needed."

It's the "as needed" part that needed explanation. What, exactly, were the practical side effects of what was needed? Your phrasing of the tests does a good job of explaining your philosophical argument (a good thing) but the practical side effects needed more details. Such as what

do bind [self] c

means in different situations. That is the only reason for how my version of the tests was structured.

You are acting like your argument was under attack - this is not the case. I even explained your argument in code when the ticket description wasn't being understood, so that your argument would be more effective. As I did with the other arguments. It's only a matter of explaining all sides of the issue, so a decision can be made. And now it has been - good :)

Carl Sassenrath
28-Apr-2010 15:57:56
Well, now that the code is available to test, let's see what other feedback we get. If people want 'self in objects as a settable variable, there may be a way to make that happen. Keep in mind my primary motivation is to avoid allocating extra memory for the self word, because its usage is rare. (I've used it maybe three or four times in 10 years.)

Regarding context, don't we use that term in a general way to mean the net effect of all name-value bindings for a specific sequence of words? I think this definition is as good as any. It's the environment. I'm not sure what you mean by "not being contextual" but the only formal comp-sci definition that I know of is that related to an OS process environment, e.g. a task's context. You might extend that to include true continuations (in the Scheme sense.)

WRT a programmer knowing if a context is selfish... it doesn't seem that critical (and that's a good thing.) But, perhaps we won't know for sure until we see more usage examples with the new A97 test core. For example, we want to be sure that code like Maxim's classes still works.

What I like about what we've got so far is that it's easy to remember and understand. If you use self, it's going to relate to your current object context, not to your functions or loops. I think that was the primary objective that developers wanted.
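
A minimal sketch of that rule, assuming an R3 build that behaves as described in this thread (selfish objects, selfless functions and loops). The names obj, get-self and loop-self are hypothetical and chosen only for illustration:

```rebol
; Sketch only: assumes the selfish/selfless behaviour discussed above.
obj: object [
    name: "example"

    ; Inside a method, self is meant to refer to the enclosing object,
    ; because function contexts are selfless.
    get-self: does [self]

    ; Inside a loop body, self is also meant to refer to the object,
    ; because loop contexts are selfless as well.
    loop-self: does [repeat n 1 [self]]
]

; Both comparisons check whether self resolved to obj itself.
print same? obj obj/get-self
print same? obj obj/loop-self
```

If either line reports a mismatch, the build is still binding self to the function or loop context, which is the situation the tickets in this thread describe.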

meijeru
28-Apr-2010 16:39:29
repeat i 1 [probe self]

still gives the loop context, not the outer one! This is the same as with A97.

Carl Sassenrath
28-Apr-2010 17:18:40
Meijeru: Interesting, but it's a bit strange, because:

ob: object [f: does [probe self]] ob/f

gives the correct object result. So, perhaps there's an odd problem in binding to the global module. I'll check it out, thanks for noting it.

Brian Hawley
28-Apr-2010 19:25:56
Meijeru's problem is bug#1529. It's not a global module binding thing, it's a loop thing: Loops are still binding 'self, even though they don't reserve it, same as a97.

For that matter, bug#447 has reverted in your test build so closures are binding 'self again too, and still not allowing 'self as a parameter (bug#1528). Functions are working well though in the test build (the rest of bug#1528).

The tests I posted to bug#1549 comments and R3 chat illustrate the problems to solve. Note that these are problems to solve in order to make the chosen object model work as you said it would. I'm not pushing for #1544 :)

I'll add more tests for construct and construct/with behavior. We need to decide on make and construct/with compatibility rules for selfless prototypes of selfish objects. I'm leaning towards making the result selfless if the proto has a 'self field, but otherwise selfish. It's not a problem to make a selfless context from a selfish prototype.

Andreas
28-Apr-2010 19:43:12
Will we also get a way to make selfless contexts?

Brian Hawley
28-Apr-2010 20:37:46
"Regarding context, don't we use that term in a general way to mean the net effect of all name-value bindings for a specific sequence of words?"

That would make sense, yes. But no, in conversation in this community we generally use it as a euphemism for the collection of bindings associated with a single object!, function!, whatever. These bindings are applied at definition time, so they are contextual relative to the definition process at that point, but they are just direct-bound at runtime.

Let me explain what I mean by "not being contextual" (pardon the explanation of stuff you know - it's for the rest of the readers):

In many other languages the runtime context of a block of code is determinable at runtime: Variable lookup is performed at runtime relative to the currently running function with a fallback to the lexically surrounding function, or the calling function, depending on the language. So functions like what Ridcully proposed are possible. For others they are resolved at definition time, and the compiler resolves the "context" at compile time, so those functions can be faked by the compiler.

In REBOL, a code block has no single context (no direct reference to any collection of bindings to use by default), and is not contextually associated to any particular object or function. Instead, the contents of the block can have context, each element with its own contextual definition (binding).

This puts the self keyword into the "faked by the compiler" camp, but makes a real function that determines the runtime dynamic context (in the sense that we generally use the term) difficult to do. But not impossible, at least for functions, if you use R3's stack function with an option to return the frame! of the calling function (in theory: stack currently doesn't have such an option, or need it). Less possible for the "current" object context though, or loop contexts. And definitely impossible to determine the lexical context at runtime beyond the "current" function - that information is not kept even during definition time.

Does that make more sense? I would really prefer that we use the term "context" in the way you describe, and come up with a better official term for what we have been calling a "context". Perhaps "frame", or "environment", though we would still have to explain that they are resolved at definition time, mostly.
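
The point that each word in a block carries its own binding can be seen in a small sketch. The names a-ctx, b-ctx and blk are hypothetical; the only calls used are object, bind, first, reduce and probe:

```rebol
; Two separate objects, each with its own x.
a-ctx: object [x: 1]
b-ctx: object [x: 2]

; Build one block holding two words that are both spelled x.
; The first is bound to a-ctx, the second to b-ctx.
blk: reduce [
    first bind [x] a-ctx
    first bind [x] b-ctx
]

; The block itself has no context; each word is looked up through
; its own binding, so the two x words yield different values.
probe blk
probe reduce blk
```

The block molds as two identical-looking words, which is why a question like "what is the context of this block?" has no single answer.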

Carl Sassenrath
28-Apr-2010 21:09:43
www.rebol.com/r3/downloads/r3c2-a97.exe updated to fix Meijeru's bug above. Problem was in the binding mechanism, a difficult bug. (Also, if you have complicated code that has binding problems, EVOKE 3 to check the binding table.)

www.rebol.com/r3/self-test.r updated -- added construct and a few minor things, also appended BrianH's #1549 comment tests (to make sure we ran them.) All seem to pass.

Also, construct objects should be selfish, like all other objects. However, I noticed today that construct does not bind deep its own context. It needs definition.

If we are going to have selfless contexts, we need a strong justification of what they add. Keep in mind that self in contexts costs nothing.

Brian Hawley
28-Apr-2010 22:02:32
Andreas: "Will we also get a way to make selfless contexts?"

That sounds like a job for make context! [] (or whatever we call the type), or maybe construct/only. I hope so, since I can't write tests that check for compatibility between selfish and selfless contexts without being able to reliably create the persistent (non-function) selfless ones, or to tell the difference between them with that function Carl says isn't critical :)

As it is, there is no way in the current test build to mold a selfless context to a syntax that do can handle - this can't be fixed without changing the spec syntax for make object!, or calling selfless contexts something else. Nor can we detect selflessness without trying bind and testing the results. This all makes selfless contexts really annoying to use. I really hope this is fixed for a98.

Brian Hawley
29-Apr-2010 0:21:08
Latest test build, all tests pass. I even tested other loop functions just in case. These test sections are redundant and can be removed: "Binding self explicitly later" and "Brian's loop tests".

"If we are going to have selfless contexts, we need a strong justification of what they add. Keep in mind that self in contexts costs nothing."

As of your latest test build, we do have selfless contexts: Function contexts are selfless, as are the persistent contexts created by closures with or without a 'self parameter, and loops with or without a 'self loop variable.

In the case of selfless contexts created without a 'self field, the self keyword is not bound, but 'self still can't be added later with append (like any other object). If selfless contexts are created with a 'self field, then that field is bound, but not as the keyword, and without the restrictions; basically selfish, but with 'self overridden.

When persistent selfless contexts are used as the prototype of make object! or construct/with, the resulting object is selfish. If the field 'self is in the prototype, it overrides the self keyword.

This is all good enough for my tests, and Ladislav's.

The only things missing are do-friendly syntax for selfless contexts (#1552), and a function to determine whether a context is selfless. We could just ignore the former if necessary until the object spec syntax is changed (if ever). For the latter, the function will likely be called rarely enough that an ad-hoc mezzanine will do the job:

selfless?: func [
    "Returns true if context doesn't bind 'self."
    context [any-word! any-object!]
        "A reference to the target context"
    /local self
][
    self =? do bind/copy [self] context
]

That function works with the current test build, and could be added as a mezzanine. I made a ticket requesting it (#1584), for inclusion in a98.

Gabriele
29-Apr-2010 3:29:08
"The terminology selfless and selfish is worthwhile. Objects and modules are selfish (contain an implicit 'self that cannot be modified.) Functions, closures, and loop contexts are selfless."

Carl, thanks! :)

Carl Sassenrath
29-Apr-2010 15:27:41
Brian, on selfless contexts, I understand what you're saying... but, what I'm asking for are some usage examples when a function or loop body will use self to refer to itself (rather than its superior context.)

Also, I'm not sure how safe it is to "export" some of these selfless contexts via make prototypes. For example, function words use relative stack bindings.

But, it's not really a question of "can it be done" it's more "will we ever do it?" Is this a usage pattern we need to support?

Andreas
29-Apr-2010 16:24:54
"If we are going to have selfless contexts, we need a strong justification of what they add."

"Is this a usage pattern we need to support?"

I think being able to implement (for example) new looping constructs at the mezzanine level is a usage pattern well worth supporting. Without the ability to create selfless contexts ourselves, we won't be able to make such looping constructs behave just as the built-in ones.

Brian Hawley
29-Apr-2010 21:03:11
"I know what I said before! Listen to what I'm saying now."
- The Incredibles

Carl, I am not advocating #1544 and bind/self - that issue has been resolved. I don't want functions and loops to bind self and am completely in favor of the behavior of the second test build linked above. I have no criticism of it, and that message was just for documentation purposes.

The current behavior doesn't propagate the selfless attribute to other objects when used as make prototypes. This is the preferred behavior. Everything works. We don't even want functions to be able to export their bindings; the treatment of 'self and bindings by functions is great right now.

You said "If we are going to have selfless contexts...". We already have selfless contexts (as of the second test build), and they behave exactly the way that people who want selfless contexts want them to behave. Even Ladislav and I agree that we have all gotten what we wanted. Consensus and victory for all has been achieved, hurrah!

All that I personally think is needed is that the bug#1584 request for a selfless? mezzanine function be added to alpha 98. The source is in the ticket, already tested, optimized and security-screened. This selfless? function will allow people who need to make the distinction between selfless and selfish contexts for whatever reason (usually security) to do so safely. Everyone else can ignore the function. And it's a one-line mezzanine, no big deal.

I really don't want you to change the behavior at all from the second test build. Nor does Ladislav, based on his comments in AltME. All of both our tests pass completely. We're even willing to let bug#1552 be dismissed or deferred.

Declare victory and move on! :)

Carl Sassenrath
30-Apr-2010 14:18:27
Brian: I'm glad to hear that. Hurray! I really appreciate the thought, proposals, and discussion you and Ladislav (and others in the community) have put into this issue. And, I think most other REBOL users also appreciate having this nailed down.

Andreas: In theory, that can be done with closures, but it would be good if someone provides a proof (or perhaps already has one and just needs to post it.)

Andreas
30-Apr-2010 18:08:07
Yes, it can be done via closure; for example:

to-selfless: funct [obj [object!]] [
  apply closure w: words-of obj [bind? first w] values-of obj
]

The only thing is that I can't come up with a method to create an empty selfless context that way. Not sure how useful such an empty selfless context actually is, though.

Brian Hawley
1-May-2010 15:26:16
Andreas, you can create an empty selfless context by creating a function or closure with no parameters, but you can't get a reference to that context so it doesn't matter. And there's no legitimate reason to do so, because the limitation that append can't add 'self to a context also applies to selfless contexts. If 'self wasn't in the selfless context when it was created, you can't add it later.

The only reasons I can see for creating an empty selfless context are to add other fields to it later, or to use it to get references to otherwise inaccessible contexts by exploiting security holes in code bound to those contexts (self leakage).

Fortunately, such code can use selfless? to screen for that kind of thing :)
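
A rough sketch of that kind of screening, assuming the selfless? mezzanine posted earlier in this thread is available (it is repeated here so the example stands alone). The helper name bind-selfish and its error text are hypothetical, not part of R3:

```rebol
; selfless? as posted earlier in the thread.
selfless?: func [
    "Returns true if context doesn't bind 'self."
    context [any-word! any-object!]
        "A reference to the target context"
    /local self
][
    self =? do bind/copy [self] context
]

; Hypothetical helper: bind code only to contexts that bind self,
; and refuse selfless ones so that a self reference in the code
; cannot silently keep pointing at some other context.
bind-selfish: func [
    "Binds code to ctx, or raises an error if ctx is selfless."
    code [block!]
    ctx [any-word! any-object!]
][
    if selfless? ctx [
        do make error! "selfless context: self would not be rebound"
    ]
    bind code ctx
]
```

Code that relies on the opposite guarantee would invert the check with unless. Either way the decision is made once, at the point where the untrusted context comes in.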

Updated 18-Apr-2024 - Copyright REBOL Technologies - REBOL.net