Exception proposals

From DocBase

Revision as of 18:01, 11 November 2010
Ladislav

Introduction

REBOL exceptions have been discussed at:

The purpose of this article is to find a coherent proposal solving the above problems (and maybe more, with a bit of luck).

While I started the Exceptions article, Brian's edits made it impossible for me to continue there. Therefore I decided to write my proposal here, trying to achieve the following:

  • no intended introductions of bugs into the code somebody else wrote
  • no intended reformulation of somebody else's proposal, introduction of counterproposals is encouraged instead
  • to substantiate questioned claims, real code examples are preferred

As long as you are willing to keep these simple rules, you are welcome to add your proposals below.

Types of exceptions

The R3 Exception/Error Mechanism article mentions that there are two types of exceptions in REBOL: unwinds and throws. Since every exception type can be emulated using the other one, I decided to make a couple of speed tests to find out what the relative speed of the respective exception types is:

; the relative speed of throw is 100%
base-throw: time-block [try [none]] 0,05
base-unwind: round to percent! base-throw / time-block [catch [none]] 0,05
; == 106% 
error: make error! ""
handled-throw: time-block [try [do error]] 0,05
handled-unwind: round to percent! handled-throw / time-block [catch [throw none]] 0,05
; == 139%

This simple test seems to suggest that unwinds are faster than throws, which implies that it is preferable to replace throws by unwinds whenever possible. As noted above, that should always be possible; it should be handled as an implementation detail that does not "leak" into user space, which would also correct the corresponding unwind bugs mentioned in CureCode.

Helper functions

BIND-TO-NEW

The bind-to-new function binds the given block to a newly created selfless context.

It works "as is" in the current interpreter version, and there is no reason why it should not work in all discussed interpreter variants.

The current version is a result of cooperation with Brian, who tested it and corrected my original bugs.

bind-to-new: funct [
    {Bind the given block or word to a new selfless context.}
    block [block! any-word!] {Block or word to bind (copied)}
    words [block! any-word!] {Word(s) of the new context}
    value [any-type!] {Initial value or block of values of the words}
][
    ; Convert words to something FOREACH will handle
    words: either any-word? words [reduce [to word! words]] [
        map-each x words [either any-word? :x [to word! x] [:x]]
        ; If words is empty or contains non-words then
        ; FOREACH will trigger a 'script 'invalid-arg error. 
    ]
    unless block? :value [ ; Create a block filled with references to value.
        value: append/only/dup make block! len: length? words :value len
    ]
    ; Each word is set/any the value at that position in the value block.
    ; If not enough values then the extra words are set to none.
    ; If too many values then the extra values will be ignored.
    ; Duplicate words of any type are set to the last associated value.
    foreach (words) value reduce [:return :quote :block]
]

RETURN and EXIT

Note: The code exit is equivalent to return #[unset!], therefore it is not necessary to discuss it separately. When mentioning "function using dynamic return", we usually mean "function using dynamic return and exit".
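A minimal sketch of this equivalence follows. The function names f and g are illustrative assumptions made for this example only; they are not part of any proposal.

```rebol
; F leaves via EXIT, G leaves via RETURN with an explicit unset value
f: func [] [exit]
g: func [] [return #[unset!]]

; both calls are expected to yield an unset value
print unset? f
print unset? g
```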

The current state (dynamic return)

As mentioned in the Errors article, REBOL currently uses just dynamically scoped return and exit, with no alternative behaviour. Problems with dynamic scoping occur when a function processes a code block (or blocks) with return or exit coming from different contexts. In such cases, the behaviour is actually bug-prone, as demonstrated below.

USE

The use function is currently implemented as follows:

use: func [
    "Defines words local to a block."
    vars [block! word!] "Local word(s) to the block"
    body [block!] "Block to evaluate"
] [
    apply make closure! reduce [to block! vars copy/deep body] []
]

Problems:

  • The use function and other control functions, like the collect function below, are not supposed to catch return or exit from the body block. The place they return to should be related to the "context of body origin" instead. That does not happen, because the inner closure catches the dynamic return, so the implementation is subject to bug #539.
  • Bug #539 is not correctable in the current interpreter versions; that is why this document exists.
  • Since the anonymous inner closure is called just once, the copy/deep call deep copying the given body block is superfluous.
  • The implementation is more complicated than necessary, as demonstrated below.
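The first problem can be sketched as follows. The function name f and the word x are illustrative assumptions. The behaviour noted in the comments is the one implied by the closure-based use shown above, and it may differ in interpreter builds where use is implemented differently.

```rebol
f: func [] [
    ; this RETURN is meant to leave F with the value 1
    use [x] [return 1]
    ; with the closure-based USE, the inner closure catches the
    ; dynamic RETURN, so evaluation continues with the next value
    2
]
probe f
```

With a use that is transparent for return, the same call would be expected to yield 1.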

COLLECT

The current implementation looks as follows:

collect: make function! [[
    {Evaluates a block, storing values via KEEP function, and returns block of collected values.}
    body [block!] "Block to evaluate"
    /into {Insert into a buffer instead (returns position after insert)}
    output [series!] "The buffer series (modified)"
][
    unless output [output: make block! 16]
    do func [keep] body func [value [any-type!] /only] [
        output: apply :insert [output :value none none only]
        :value
    ]
    either into [output] [head output]
]]

Problems:

  • The anonymous function inside the collect function body catches the dynamic return, causing the bug #539.
  • The bug #539 cannot be circumvented currently.
  • The implementation is more complicated than necessary, as demonstrated below.

Disadvantages

  • The above mentioned bug #539 is serious, and it is not solvable in this model.

The state of R2 (dynamic return with optional transparency)

This is equivalent to R2's dynamic return enhanced by the [throw] function attribute. It is mentioned here for comparison purposes, not as a proposal. To respect Carl's wish to use a different (set-word based) method of specifying function attributes, I use the throw: method instead of the [throw] method used in R2.

USE

The use function can be implemented as follows:

use: func [
    {Defines words local to a block.}
    vars [block! word!] "Local word(s) to the block"
    body [block!] "Block to evaluate"
    throw:
] [
    ; this initializes all VARS to #[none!]
    do bind-to-new body vars [#[none!]]

    ; if we wanted to initialize all VARS to #[unset!]
    ; we could have used:
    ; do bind-to-new body vars head insert/dup copy [] #[unset!] length? vars
]

Notes:

  • The implementation uses the throw: function attribute.
  • The implementation is not subject to bug #539.
  • The implementation is simpler than the current mezzanine.

COLLECT

The collect function can be implemented as follows:

collect: make function! [[
    {Evaluates a block, storing values via KEEP function, and returns block of collected values.}
    body [block!] "Block to evaluate"
    /into {Insert into a buffer instead (returns position after insert)}
    output [series!] "The buffer series (modified)"
    throw:
][
    unless output [output: make block! 16]
    do bind-to-new body [keep] reduce [
        func [value [any-type!] /only] [
            output: apply :insert [output :value none none only]
            :value
        ]
    ]
    either into [output] [head output]
]]

Notes:

  • The implementation uses the throw: function attribute.
  • The implementation is not subject to bug #539.
  • The implementation is simpler than the current mezzanine.

GENERAL-CYCLE

General iteration constructs can be used to express any standard sort of loop, as well as others -- such as looping over a number of collections in parallel. Where a more specific looping construct can be used, it is usually preferred over the general iteration construct, since it often makes the purpose of the expression more clear.

Implementation:

general-cycle: func [
    {General cycle control function}
    init [block!] {cycle initialization}
    test [block!] {test code}
    inc [block!] {incrementation code}
    body [block!] {cycle body}
    throw:
] [
    ; to not modify the given arguments
    init: copy/deep init
    body: head insert tail copy/deep body copy/deep inc
	
    use collect-words/set init reduce [
        :do init
        :while test body
    ]
]

Notes:

  • The implementation uses the throw: function attribute.
  • The implementation is not subject to bug #539.

Example:

; this example doubles the cycle variable after every iteration
general-cycle [i: 1] [i <= 2000] [i: i + i] [print i]

Definitional CATCH/THROW mezzanine pair

This is how a definitional catch/throw mezzanine pair (this one without error localization, but the error localization can be added as well) can be implemented:

make object! [
    sys-catch: :catch
    sys-throw: :throw
    set 'catch func [
        {Catches a throw from a block and returns its value.}
        block [block!] "Block to evaluate"
        throw:
        /local result caught normal?
    ] [
        use [throw] compose/deep/only [
            set/any 'caught sys-catch [
                throw: func [
                    {Throws control back to its catch.}
                    value [any-type!] "Value returned from catch"
                ] [
                    sys-throw append/only tail (copy [throw]) get/any 'value
                ]
                set/any 'result do (block)
                normal?: true
            ]
            case [
                normal? [get/any 'result]
                all [
                    block? get/any 'caught
                    word? pick :caught 1
                    same? first :caught 'throw
                ] [second :caught]
                'else [sys-throw get/any 'caught]
            ]
        ]
    ]
]

Advantages

  • Solves the most serious traditional problems (see use and collect).
  • Most functions would not need to specify an option.
  • The dynamic return is implemented already.
  • Doesn't combine the transparency option with an option to specify a type for the return value, allowing them to be separate.
  • Easy to translate code from R2.

Disadvantages

  • Bad locality for unhandled return, bug #1506 (at least currently).
  • Since the dynamic return is nonlocal, one may need to catch unhandled returns. On the other hand, this may lead to catching some dynamic returns that are, in fact, handled.
  • Does not address the need to have both a function scoped return as well as stay transparent for returns from different contexts at the same time. This is critical especially in R2, where errors must always be returned using return.
  • Some users have trouble understanding the dynamic return as a concept.
  • Some users question the dynamic return. They find the definitional variant more comfortable, as well as more understandable.
  • The transparency option would need to be specified for almost all control functions written in REBOL.
  • The transparency option is even harder to explain to users than the concept of dynamic return.
  • Below you can find code that cannot be implemented using the dynamic return.
  • Some programmers use code blocks outside of functions e.g. in parse rules, putting even "unhandled" return or exit calls into them. The ability to write such "unhandled" code does not allow the reader to detect what is the intended context to return from, hurting the understandability and maintainability of such code.

Definitional return

The difference between the dynamic return and the definitional return is that, with the definitional return, make would bind the 'return and 'exit words in function bodies to local versions of the return and exit functions at function definition time. Those locally bound functions would only return to the function to which they are bound. The top-level versions of return and exit would just trigger an error, or not be defined at all.

It has been suggested that the make function should be able to leave the 'return and 'exit words in the function body unbound when that is wished. Such a feature is possible and, as opposed to the R2 [throw] function attribute, it would not influence how the functions work; it would influence just the set of words the make function binds when creating such a function. Nevertheless, the examples below (exploring the limits in a daring way) demonstrate why such a feature is actually unnecessary.

USE

The use function can be implemented as follows:

use: func [
    {Defines words local to a block.}
    vars [block! word!] "Local word(s) to the block"
    body [block!] "Block to evaluate"
] [
    ; this initializes all VARS to #[none!]
    do bind-to-new body vars [#[none!]]

    ; if we wanted to initialize all VARS to #[unset!]
    ; we could have used:
    ; do bind-to-new body vars head insert/dup copy [] #[unset!] length? vars
]

Notes:

  • The implementation does not need to use any function attribute.
  • The implementation is not subject to bug #539.
  • The implementation is simpler than the current mezzanine.

COLLECT

The collect function can be implemented as follows:

collect: make function! [[
    {Evaluates a block, storing values via KEEP function, and returns block of collected values.}
    body [block!] "Block to evaluate"
    /into {Insert into a buffer instead (returns position after insert)}
    output [series!] "The buffer series (modified)"
][
    unless output [output: make block! 16]
    do bind-to-new body [keep] reduce [
        func [value [any-type!] /only] [
            output: apply :insert [output :value none none only]
            :value
        ]
    ]
    either into [output] [head output]
]]

Notes:

  • The implementation does not need to use any function attribute.
  • The implementation is not subject to bug #539.
  • The implementation is simpler than the current mezzanine.

GENERAL-CYCLE

General iteration constructs can be used to express any standard sort of loop, as well as others -- such as looping over a number of collections in parallel. Where a more specific looping construct can be used, it is usually preferred over the general iteration construct, since it often makes the purpose of the expression more clear.

Implementation:

general-cycle: func [
    {General cycle}
    init [block!] {cycle initialization}
    test [block!] {test code}
    inc [block!] {incrementation code}
    body [block!] {cycle body}
] [
    ; to not modify the given arguments
    init: copy/deep init
    body: head insert tail copy/deep body copy/deep inc
	
    use collect-words/set init reduce [
        :do init
        :while test body
    ]
]

Notes:

  • The implementation does not need to use any function attribute.
  • The implementation is not subject to bug #539.

Example:

; this example doubles the cycle variable after every iteration
general-cycle [i: 1] [i <= 2000] [i: i + i] [print i]

Definitional CATCH/THROW mezzanine pair

Having definitional return in REBOL, the definitional catch/throw pair can be defined as a mezzanine as follows:

catch: func [
    {Catches a throw from a block and returns its value.}
    body [block!]
] [
    ; create a new function to have a new definitional RETURN available
    ; use the new definitional RETURN as THROW in the BODY
    do func [] [do foreach throw reduce [:return] reduce [body]]
]

Notice how the definitional return of the anonymous function is used as the definitional throw.

A mezzanine definitional break looks to be implementable the same way.

Definitional mezzanine CONTINUE

This is not as perfect as the above definitional mezzanine catch/throw solution, but it is usable when needed.

This version tries to bind the given BLOCK just once, which can be done, as demonstrated.

c-aware: func [
    {make a block definitional mezzanine CONTINUE-aware}
    block [block!]
    /local do-block do-block-body continue continue-body
] [
    do-block-body: [exit block]
    do-block: make function! reduce [[block] do-block-body]
    
    ; the CONTINUE function has to exit from the DO-BLOCK function
    continue-body: [exit]
    continue: make function! reduce [[] continue-body]
    ; replacing the 'exit word by the one from the DO-BLOCK function body
    change continue-body first do-block-body
	
    ; the DO-BLOCK function has to do the BLOCK actually
    ; replacing the 'exit word by 'do
    change do-block-body 'do
    
    reduce [:do-block bind-to-new block [continue] reduce [:continue]]
]

Usage:

for n 1 5 1 c-aware [
    if n < 3 [continue] print n
]

Advantages

  • Solves the control function problems without needing any function attribute.
  • The easiest to explain to people who are familiar with lexical scoping, but unfamiliar with dynamic return.
  • No options need to be explained either.
  • Satisfies the need to have both a function scoped return as well as pass through return from different contexts at the same time.
  • Can be used to implement the definitional catch/throw mezzanine pair.
  • Can be used (moderately) to implement the definitional mezzanine continue.
  • Can be used to implement the definitional mezzanine break.
  • Good locality for the unhandled return or exit errors (the cure for #1506 is already implemented, with no extra overhead), meaning that it is easy to correctly catch unhandled definitional returns.

Disadvantages

  • If the code block using return is nested in the function body, return works automatically. If not, the user needs to either bind the code block, or get the correct return function otherwise (get/set word, etc.). Nevertheless, that is not hard to do, as can be seen in the mezzanine definitional catch/throw implementation.
  • Slight added overhead to function creation and memory use.

Hybrid return model

As stated in the Errors article, it is possible that we will have a hybrid model, allowing functions that catch the dynamic return to coexist with functions that are transparent for the dynamic return and use their own definitional return.

Advantages

  • All control function problems are solvable using the functions with the definitional return, i.e. the same way as in the "Definitional return" section.
  • Can implement definitional mezzanine catch/throw pair like above.
  • Solves the unhandled return error locality for definitional returns.

Disadvantages

  • Would still have the unhandled return error locality problem for dynamic return (if not corrected).
  • Would need to use an option to specify that the function is using a definitional return (provided the dynamic return will be the default).
  • It is not obvious whether a return or exit in the code is definitional or dynamic, which will lead to errors (this can be cured by introducing different words for definitional return and exit, e.g. something like 'local-return and 'local-exit, minimizing possible confusion).
  • If we adopt the proposal in the Errors article to use return: as a conflated option to both specify definitional return and optionally a function return type, it will be impossible to specify a return type on a dynamic return function.
  • Slight added overhead to function creation and memory use for definitional return functions.

QUIT

  • The QUIT function is useful, allowing the user to finish the work of the interpreter.
  • The CATCH/quit function is useful for applications that "need" to catch QUIT because they do not want the code they run (e.g. tested code) to escape from their control.
  • Having the QUIT/now function is an error (see #1743), imitating just the state that existed before CATCH/quit was introduced. If we want to have that state, it suffices to undefine CATCH/quit.

HALT

Except for bugs (counting QUIT/NOW as one too), HALT is the only exception in REBOL able to crash the test environment. It is desirable to achieve a situation in which all exceptions are catchable (this can easily be turned into its opposite by undefining the respective catch function or functions); therefore, a corresponding catch function was proposed in #1742.

BREAK

Dynamic break

break is currently a dynamic exception. The /return refinement is used to force the respective loop to return a value. Since the exception is dynamic, every problem of the dynamic return has an analogy in the dynamic break; see the examples below.

DO-ALL control function

To illustrate that there is an analogy to the bug #539, we define this ("artificial") control function. Its purpose is to evaluate the subblocks of the given block. Implementation:

do-all: func [
    {evaluate the subblocks of the given BLOCK}
    block [block!]
] [
    foreach subblock block [do subblock]
]

Analogy to #539

loop 1 [do-all [[break/return 1]] 2]
; == 2 (1 is expected)

Advantages

  • Already implemented.
  • The analogy to the bug #539 can be circumvented, since we already do have a (kind of) loop construct in REBOL, which is transparent for the dynamic break.

Disadvantages

  • Analogy to the bug #539.
  • Bug #1506 and related problems with catching just the unhandled breaks, leaving the handled ones untouched (this one is more serious than the previous one, which, as mentioned, can be circumvented).

Definitional break

Advantages

  • No problems with an analogy to bug #539.
  • No problems with catching unhandled breaks.
  • A possibility to break from several nested loops using just one break call.

CONTINUE

This is formally very similar to break, so I refer the reader to the BREAK section.

THROW

Dynamic throw

THROW is currently a dynamic exception. The /name refinement can be used to "individualize" throws.
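A short sketch of an individualized throw follows. The word my-tag is an arbitrary name chosen for this example, and result is an illustrative variable; neither is taken from any proposal.

```rebol
result: catch/name [
    ; the named throw transfers control to the CATCH using the same name
    throw/name 42 'my-tag
    print "not reached"
] 'my-tag

; RESULT now holds the thrown value
probe result
```

Both the catch/name call and the throw/name call have to mention the same name for the throw to be caught by the intended catch.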

Advantages

  • Already implemented.
  • The /name refinement makes it possible to individualize throws.

Disadvantages

  • While the /name refinement can be used to "individualize" throws, it most frequently remains unused, because the pair catch/name block name and throw/name value name is too long to write compared to the proper individualization offered by the definitional throw.
  • As opposed to the definitional throw, the name word does not individualize the throw well enough to preclude naming conflicts (see bug #1744).
  • An analogy to #539 is lurking as well.
  • Analogies to all problems of dynamic exception handling mentioned above.

Definitional THROW

Advantages

  • Can be implemented as a mezzanine, having the definitional RETURN (see above).
  • No problems with an analogy to bug #539.
  • Free individualization (no need to pass any refinements/words).