Exception proposals

From DocBase

Revision as of 13:03, 13 November 2010, by Ladislav

Introduction

REBOL exceptions have been discussed at:

The purpose of this article is to find a coherent proposal solving the above problems (and maybe more, with a bit of luck).

While I started the Exceptions article, I was unable to continue it due to Brian's edits. Therefore I decided to write my proposal here, trying to achieve the following:

  • no intended introductions of bugs into the code somebody else wrote
  • no intended reformulation of somebody else's proposal, introduction of counterproposals is encouraged instead
  • to substantiate questioned claims, real code examples are preferred

As long as you are willing to follow these simple rules, you are welcome to add your proposals below.

Types of exceptions

The R3 Exception/Error Mechanism article mentions that there are two types of exceptions in REBOL: unwinds and throws. Since every exception type can be emulated using the other one, I decided to make a couple of speed tests to find out what the relative speed of the respective exception types is:

; the relative speed of throw is 100%
base-throw: time-block [try [none]] 0,05
base-unwind: round to percent! base-throw / time-block [catch [none]] 0,05
; == 106% 
error: make error! ""
handled-throw: time-block [try [do error]] 0,05
handled-unwind: round to percent! handled-throw / time-block [catch [throw none]] 0,05
; == 139%

This simple test suggests that unwinds are faster than throws, so it is preferable to replace throws with unwinds whenever possible. As noted above, that should always be possible. The replacement should be handled as an implementation detail that does not "leak" into user space, which would also correct the corresponding unwind bugs mentioned in CureCode.

Dynamic versus definitional exceptions

Dynamic exceptions

Dynamic exceptions are caught by the (at run-time) closest (i.e. the last executed) catching construct.

Example using dynamic return:

c1: func [arg][do arg]
f1: does [print ["c1 returned:" c1 [return 1]] return 2]
print ["f1 returned:" f1]

Results are:

c1 returned: 1
f1 returned: 2

Notice that the return 1 has been caught by the c1 function, since that was the closest (last executed) catching construct for it.

Definitional exceptions

Definitional exceptions are caught by the (at definition time) closest (i.e. the last defining) catching construct.

Example using definitional return:

c1: func [arg][do arg]
f1: does [print ["c1 returned:" c1 [return 1]] return 2]
print ["f1 returned:" f1]

Results are:

f1 returned: 1

Notice that the return 1 has been caught by the f1 function, since that was the closest (last defining) catching construct for it.

Helper functions

SELFLESS

The selfless function creates a new selfless context containing the given words.

It works "as is" in the current interpreter version, and there is no reason why it should not work in all discussed interpreter variants.

selfless: funct [
    {Create a new selfless context.}
    words [block! any-word!] {Word(s) of the new context (modified)}
][
    either any-word? :words [word: :words][
        unless any-word? word: first :words [
            do make error! [
                Type: 'Script
                id: 'expect-arg
                arg1: 'selfless
                arg2: 'any-word
                arg3: type? :word
            ]
        ]
    ]
    word: to word! :word
    word: first foreach :word [#[unset!]] compose/deep [[(:word)]]
    bind/new :words :word
    bind? :word
]

RETURN and EXIT

Note: The code exit is equivalent to return #[unset!], therefore it is not necessary to discuss it separately. When mentioning "function using dynamic return", we usually mean "function using dynamic return and exit".
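
The equivalence can be checked with a minimal sketch; the function names f and g are hypothetical and used only for illustration:

```rebol
; F leaves through EXIT, G through RETURN with an explicit unset value
f: does [exit]
g: does [return #[unset!]]

; UNSET? accepts any value, so it can inspect both results
print unset? f
print unset? g
```

If exit and return #[unset!] are indeed equivalent, both lines print true.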

The current state (dynamic return)

As mentioned in the Errors article, REBOL currently uses just dynamically scoped return and exit, with no alternative behaviour. Problems with dynamic scoping occur when a function processes a code block (or blocks) with return or exit coming from different contexts. In such cases, the behaviour is actually bug-prone, as demonstrated below.

USE

The use function is currently implemented as follows:

use: func [
    "Defines words local to a block."
    vars [block! word!] "Local word(s) to the block"
    body [block!] "Block to evaluate"
][
    apply make closure! reduce [to block! vars copy/deep body][]
]

Problems:

  • The use function and other control functions, like the collect function below, are not supposed to catch return or exit from the body block. The place they return to should instead be determined by the "context of body origin". That does not happen, because the inner closure catches the dynamic return, so the implementation is subject to bug #539.
  • Bug #539 cannot be corrected in the current interpreter version.
  • Since the anonymous inner closure is called just once, the copy/deep call deep copying the given body block is superfluous.
  • The implementation is more complicated than necessary, as demonstrated below.
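
A minimal sketch of how the problem shows up, assuming the closure-based use above; the function name f is hypothetical:

```rebol
; F is meant to return 1 from inside the USE body
f: func [] [
    use [a] [return 1]
    2 ; reached only if USE catches the RETURN
]
print f
```

With the implementation above, return 1 is caught by the inner closure, so use merely yields 1, evaluation continues, and f returns 2 instead of the intended 1.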

COLLECT

The current implementation looks as follows:

collect: make function! [[
    {Evaluates a block, storing values via KEEP function, and returns block of collected values.}
    body [block!] "Block to evaluate"
    /into {Insert into a buffer instead (returns position after insert)}
    output [series!] "The buffer series (modified)"
][
    unless output [output: make block! 16]
    do func [keep] body func [value [any-type!] /only][
        output: apply :insert [output :value none none only]
        :value
    ]
    either into [output][head output]
]]

Problems:

  • The anonymous function inside the collect function body catches the dynamic return, causing the bug #539.
  • The bug #539 cannot be circumvented currently.
  • The implementation is more complicated than necessary, as demonstrated below.
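
The same effect can be sketched for collect, assuming the implementation above; first-even is a hypothetical function that is meant to return the first even number it meets:

```rebol
first-even: func [numbers [block!]] [
    collect [
        foreach n numbers [
            if even? n [return n] ; intended to leave FIRST-EVEN
            keep n
        ]
    ]
]
probe first-even [1 3 4 5]
```

Since the anonymous function inside collect catches the return, collect finishes normally and first-even yields the block of values kept so far rather than the number 4.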

Disadvantages

  • The above mentioned bug #539 is serious, not allowing any nonnative control function to work correctly. The problem is not solvable in this model.

The state of R2 (dynamic return with optional transparency)

This is equivalent to R2's dynamic return enhanced by the [throw] function attribute. It is mentioned here for comparison purposes, not as a proposal. To respect Carl's wish to use a different (set-word based) method of specifying function attributes, I use the throw: method instead of the [throw] method used in R2.

LET

The let function creates a new selfless context containing the given vars, sets vars to values, binds (/copy) the given body block and evaluates it. (The implementation below is pretty much self-documenting.)

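let: func [
    {Defines words local to a block.}
    vars [block! word!] "Local word(s) to the block"
    values [any-type!] "Value or block of values"
    body [block!] "Block to evaluate"
    throw:
    /local context
][
    context: selfless vars
    ; BIND returns VARS bound to the new context (a single word is not
    ; rebound in place); GET/ANY fetches VALUES without evaluating a
    ; function value and without failing on #[unset!]
    set/any bind vars context get/any 'values
    do bind/copy body context
]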

Advantages of having such a function in REBOL:

  • let yields the result that is wanted, as opposed to the context function, which yields the object. This suggests that let will be more popular than context.
  • let facilitates initialization, so it will be more popular than use.
  • let is shorter and simpler than the DO FUNC [variables] block idiom, so it will be more popular than that idiom as well.
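
The three alternatives mentioned above can be compared side by side. This is a sketch only: the let call assumes the let definition above (including the proposed throw: attribute), while the other two lines use existing constructs.

```rebol
; LET: words, their initial values, and the body in one call
print let [a b] [1 2] [a + b]

; the DO FUNC idiom: the values come after the body
print do func [a b] [a + b] 1 2

; CONTEXT evaluates similar code, but yields the object, not the sum
probe context [a: 1 b: 2 a + b]
```

If let behaves as described, the first two lines both print 3.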

USE

The use function can be implemented as follows:

use: func [
    {Defines words local to a block.}
    vars [block! word!] "Local word(s) to the block"
    body [block!] "Block to evaluate"
    throw:
][
    ; this initializes all VARS to #[unset!]
    let vars #[unset!] body

    ; if we wanted to initialize all VARS to #[none!]
    ; we could have used:
    ; let vars #[none!] body
]

Notes:

  • The implementation uses the throw: function attribute.
  • The implementation is not subject to bug #539.
  • The implementation is simpler than the current mezzanine.

COLLECT

The collect function can be implemented as follows:

collect: make function! [[
    {Evaluates a block, storing values via KEEP function, and returns block of collected values.}
    body [block!] "Block to evaluate"
    /into {Insert into a buffer instead (returns position after insert)}
    output [series!] "The buffer series (modified)"
    throw:
][
    unless output [output: make block! 16]
    let 'keep func [value [any-type!] /only][
        output: apply :insert [output :value none none only]
        :value
    ] body
    either into [output][head output]
]]

Notes:

  • The implementation uses the throw: function attribute.
  • The implementation is not subject to bug #539.
  • The implementation is simpler than the current mezzanine.

The general cycle

General iteration constructs can be used to express any standard sort of loop, as well as others -- such as looping over a number of collections in parallel. Where a more specific looping construct can be used, it is usually preferred over the general iteration construct, since it often makes the purpose of the expression more clear.

Implementation:

iterate: func [
    {General cycle control function}
    init [block!] {cycle initialization}
    test [block!] {test code}
    inc [block!] {incrementation code}
    body [block!] {cycle body}
    throw:
][
    ; to not modify the given arguments
    init: copy/deep init
    body: head insert tail copy/deep body copy/deep inc
	
    let collect-words/set init #[unset!] reduce [
        :do init
        :while test body
    ]
]

Notes:

  • The implementation uses the throw: function attribute.
  • The implementation is not subject to bug #539.

Example:

; this example doubles the cycle variable after every iteration
iterate [i: 1][i <= 2000][i: i + i][print i]

Definitional mezzanine CATCH/THROW pair

This is how a definitional catch/throw mezzanine pair can be implemented:

make object! [
    sys-catch: :catch
    sys-throw: :throw
    set 'catch func [
        {Catches a throw from a block and returns its value.}
        block [block!] "Block to evaluate"
        throw:
        /local result caught normal?
    ][
        do func [
            throw:
            /throw
        ] compose/deep/only [
            set/any 'caught sys-catch [
                throw: func [
                    {Throws control back to its catch.}
                    value [any-type!] "Value returned from catch"
                ][
                    sys-throw append/only tail (copy [throw]) get/any 'value
                ]
                set/any 'result do (block)
                normal?: true
            ]
            case [
                normal? [get/any 'result]
                all [
                    block? get/any 'caught
                    word? pick :caught 1
                    same? first :caught 'throw
                ][second :caught]
                'else [sys-throw get/any 'caught]
            ]
        ]
    ]
    unset 'throw
]

Notes:

  • Unhandled throw is localized well, since the 'throw word is reported unbound or unset by the interpreter in that case.
  • Definitional mezzanine continue can be implemented using a similar approach, see also the implementation below.

Advantages

  • Solves the most serious traditional problems (see USE and COLLECT above).
  • Most functions would not need to specify an option.
  • The dynamic return is implemented already.
  • Doesn't combine the transparency option with an option to specify a type for the return value, allowing them to be separate.
  • Easy to translate code from R2.

Disadvantages

  • Since the catching construct is not obvious from the source code (being determined at run time), code readability is worse than readability of the code using definitional return.
  • Bad locality for unhandled return, bug #1506.
  • Since the dynamic return is nonlocal, one may be tempted to catch unhandled returns. This may lead to the opposite extreme: catching some dynamic returns that are, in fact, handled.
  • Does not address the need to have both a function scoped return as well as stay transparent for returns from different scopes at the same time. This is critical in R2, where errors must always be returned using return.
  • Some users have trouble understanding the dynamic return as a concept.
  • The transparency option would need to be specified for almost all control functions written in REBOL.
  • The transparency option is even harder to explain to users than the concept of dynamic return.
  • Some programmers use code blocks outside of functions e.g. in parse rules, putting even "unhandled" return or exit calls into them. The ability to write such "unhandled" code does not allow the reader to detect what is the intended context to return from, hurting the understandability and maintainability of such code.

Definitional return

The difference between the dynamic return and the definitional return is that, at function definition time, make would bind the 'return and 'exit words in function bodies to local versions of the return and exit functions. Those locally bound functions would only return to the function to which they are bound. The top-level versions of return and exit would just trigger an error, or not be defined at all.

There are suggestions that the make function should be able to leave the 'return and 'exit words in the function body unbound on request. Such a feature is possible and, as opposed to the R2 [throw] function attribute, it would not influence how the functions work; it would influence only the set of words the make function binds when creating such a function. Nevertheless, the examples below (exploring the limits in a daring way) demonstrate why such a feature is actually unnecessary.

USE

The use function can be implemented as follows:

use: func [
    {Defines words local to a block.}
    vars [block! word!] "Local word(s) to the block"
    body [block!] "Block to evaluate"
][
    ; this initializes all VARS to #[none!]
    do bind-to-new/copy body vars none

    ; if we wanted to initialize all VARS to #[unset!]
    ; we could have used:
    ; do bind-to-new/copy body vars #[unset!]
]

Notes:

  • The implementation does not need to use any function attribute.
  • The implementation is not subject to bug #539.
  • The implementation is simpler than the current mezzanine.
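
bind-to-new is used here without being defined in this article. A hypothetical sketch, assuming it behaves like let above minus the evaluation step (create a selfless context from the words, set them, and bind the block), might look like this:

```rebol
; Hypothetical sketch only: BIND-TO-NEW is not defined in this article.
; It relies on the SELFLESS helper from the "Helper functions" section.
bind-to-new: func [
    {Bind BLOCK to a new selfless context made of WORDS set to VALUES.}
    block [block!] "Block to bind"
    words [block! word!] "Word(s) of the new context"
    values [any-type!] "Value or block of values"
    /copy {Bind a deep copy instead of the original block}
    /local context
][
    context: selfless words
    ; GET/ANY passes function and unset values through untouched
    set/any bind words context get/any 'values
    either copy [bind/copy block context][bind block context]
]
```

The argument order follows the calls in this section: the block, then the words, then the values.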

COLLECT

The collect function can be implemented as follows:

collect: make function! [[
    {Evaluates a block, storing values via KEEP function, and returns block of collected values.}
    body [block!] "Block to evaluate"
    /into {Insert into a buffer instead (returns position after insert)}
    output [series!] "The buffer series (modified)"
][
    unless output [output: make block! 16]
    do bind-to-new/copy body [keep] reduce [
        func [value [any-type!] /only][
            output: apply :insert [output :value none none only]
            :value
        ]
    ]
    either into [output][head output]
]]

Notes:

  • The implementation does not need to use any function attribute.
  • The implementation is not subject to bug #539.
  • The implementation is simpler than the current mezzanine.

GENERAL-CYCLE

General iteration constructs can be used to express any standard sort of loop, as well as others -- such as looping over a number of collections in parallel. Where a more specific looping construct can be used, it is usually preferred over the general iteration construct, since it often makes the purpose of the expression more clear.

Implementation:

iterate: func [
    {General cycle}
    init [block!] {cycle initialization}
    test [block!] {test code}
    inc [block!] {incrementation code}
    body [block!] {cycle body}
][
    ; to not modify the given arguments
    init: copy/deep init
    body: head insert tail copy/deep body copy/deep inc
	
    use collect-words/set init reduce [
        :do init
        :while test body
    ]
]

Notes:

  • The implementation does not need to use any function attribute.
  • The implementation is not subject to bug #539.

Example:

; this example doubles the cycle variable after every iteration
iterate [i: 1][i <= 2000][i: i + i][print i]

Definitional mezzanine CATCH/THROW pair

With definitional return in REBOL, the definitional catch/throw pair can be defined as a mezzanine as follows:

catch: func [
    {Catches a throw from a block and returns its value.}
    body [block!]
][
    ; create a new function to have a new definitional RETURN available
    ; use the new definitional RETURN as THROW in the BODY
    do does [do bind-to-new/copy body 'throw :return]
]

Notice how the definitional return of the anonymous function is used as definitional throw.

A mezzanine definitional break appears to be implementable the same way.
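
Because every catch call binds its own throw, an outer throw can be kept as an ordinary value and used from inside an inner catch. A sketch assuming the catch defined above and an interpreter with definitional return; the word outer-throw is hypothetical:

```rebol
print catch [
    outer-throw: :throw ; the THROW belonging to the outer CATCH
    catch [
        outer-throw 1   ; leaves both CATCH calls at once
    ]
    2                   ; skipped if OUTER-THROW works as intended
]
```

Under these assumptions the outer catch yields 1, with no /name refinement involved.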

Definitional mezzanine CONTINUE

This is not as perfect as the above definitional mezzanine catch/throw solution, but usable, when needed.

This version binds the given BLOCK just once, not at the start of every loop:

c-aware: func [
    {make a block definitional mezzanine CONTINUE-aware}
    block [block!]
    /local do-block do-block-body continue continue-body
][
    do-block-body: [exit block]
    do-block: make function! reduce [[block] do-block-body]
    
    ; the CONTINUE function has to exit from the DO-BLOCK function
    continue-body: [exit]
    continue: make function! reduce [[] continue-body]
    ; replacing the 'exit word by the one from the DO-BLOCK function body
    change continue-body first do-block-body
	
    ; the DO-BLOCK function has to do the BLOCK actually
    ; replacing the 'exit word by 'do
    change do-block-body 'do
    
    reduce [:do-block bind-to-new/copy block [continue] reduce [:continue]]
]

Usage:

for n 1 5 1 c-aware [
    if n < 3 [continue]
    print n
]

Advantages

  • Solves the control function problems without needing any function attribute.
  • The easiest to explain to people who are familiar with lexical scoping, but unfamiliar with dynamic return.
  • No options need to be explained either.
  • Satisfies the need to have both a function scoped return as well as pass through return from different contexts at the same time.
  • Can be used to implement the definitional catch/throw mezzanine pair.
  • Can be used (moderately) to implement the definitional mezzanine continue.
  • Can be used to implement the definitional mezzanine break.
  • Good locality for the unhandled return or exit errors (the cure for #1506 is already implemented, with no extra overhead), meaning that it is easy to correctly catch unhandled definitional returns.

Disadvantages

  • If the code block using return is nested in the function body, return works automatically. If not, the user needs to either bind the code block, or get the correct return function otherwise (get/set word, etc.). Nevertheless, that is not hard to do, as can be seen in the mezzanine definitional catch/throw implementation.
  • Slight added overhead to function creation and memory use.

Hybrid return model

As stated in the Errors article, it is possible that we will have a hybrid model, allowing functions that catch the dynamic return to coexist with functions that are transparent to the dynamic return and use their own definitional return.

Advantages

  • All control function problems are solvable using the functions with the definitional return, i.e. the same way as in the "Definitional return" section.
  • Can implement definitional mezzanine catch/throw pair like above.
  • Solves the unhandled return error locality for definitional returns.

Disadvantages

  • Would still have the unhandled return error locality problem for dynamic return (if not corrected).
  • Would need to use an option to specify that the function is using a definitional return (provided the dynamic return will be the default).
  • It is not obvious whether a return or exit in the code is definitional or dynamic, which will lead to errors (this can be cured by introducing different words for definitional return and exit, e.g. something like 'local-return and 'local-exit, minimizing possible confusion).
  • If we adopt the proposal in the Errors article to use return: as a conflated option to both specify definitional return and optionally a function return type, it will be impossible to specify a return type on a dynamic return function.
  • Slight added overhead to function creation and memory use for definitional return functions.

QUIT

  • The QUIT function is useful, allowing the user to finish the work of the interpreter.
  • The CATCH/quit function is useful for applications which "need" to catch the QUIT, not wanting code (e.g. the tested code) to escape from their control.
  • Having the QUIT/now function is an error (see #1743), imitating just the state that existed before CATCH/quit was introduced. If we want to have that state, it suffices to undefine CATCH/quit.

HALT

Except for bugs (counting QUIT/NOW as one too), HALT is the only exception in REBOL able to cause the test environment to crash. It is desirable to reach a state in which all exceptions are catchable (this can easily be turned into its opposite by undefining the respective catch function or functions); therefore, a corresponding catch function was proposed in #1742.

BREAK

Dynamic break

break is currently a dynamic exception. The /return refinement is used to force the respective loop to return a value. Since the exception is dynamic, every problem of the dynamic return has an analogy in dynamic break; see the examples below.

DO-ALL control function

To illustrate that there is an analogy to the bug #539, we define this ("artificial") control function. Its purpose is to evaluate the subblocks of the given block. Implementation:

do-all: func [
    {evaluate the subblocks of the given BLOCK}
    block [block!]
][
    foreach subblock block [do subblock]
]

Analogy to #539

loop 1 [do-all [[break/return 1]] 2]
; == 2 (1 is expected)

Advantages

  • Already implemented.
  • The analogy to the bug #539 can be circumvented, since we already do have a (kind of) loop construct in REBOL, which is transparent for the dynamic break.
  • Does not require binding of the loop body.

Disadvantages

  • Analogy to the bug #539.
  • Bug #1506 and related problems with catching just the unhandled breaks, leaving the handled ones untouched (this one is more serious than the previous one, which, as mentioned, can be circumvented).

Definitional break

Advantages

  • No problems with an analogy to bug #539.
  • No problems with catching unhandled breaks.
  • A possibility to break from several nested loops using just one break call.

Disadvantages

  • Requires binding of the body block.

CONTINUE

This is formally very similar to break, so I refer the reader to the BREAK section above.

THROW

Dynamic throw

THROW is currently a dynamic exception. The /name refinement can be used to "individualize" throws.
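
A short sketch of such individualization with the existing functions; the names inner and outer are arbitrary:

```rebol
print catch/name [
    catch/name [
        throw/name 1 'outer ; passes through the catch named INNER
    ] 'inner
    2                       ; not reached
] 'outer
```

The throw named outer is ignored by the inner catch and caught by the outer one, so this prints 1.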

Advantages

  • Already implemented.
  • Does not require binding the block, which may be considered "expensive", since the block is evaluated just once.

Disadvantages

  • While the /name refinement can be used to "individualize" throws, it most frequently remains unused, because a pair of catch/name block name and throw/name value name is too long to write compared with the proper individualization offered by the definitional throw.
  • As opposed to the definitional throw, the name word does not individualize the throw strongly enough to preclude naming conflicts (see bug #1744).
  • An analogy to #539 is lurking as well.
  • Analogies to all problems of dynamic exception handling mentioned above.

Definitional THROW

Advantages

  • Can be implemented as a mezzanine, having the definitional RETURN (see above).
  • No problems with an analogy to bug #539.
  • Free individualization (no need to pass any refinements/words).

Disadvantages

  • Requires binding of the body block.