REBOL 3.0

REBOL Discuss and Decide

Carl Sassenrath, CTO
REBOL Technologies
4-Jul-2008 17:44 GMT

Purpose:
For issues and decisions related to R3 design and implementation.

  • Read this before posting.
  • Suggest a topic.

  • Recent Articles:

    5-Mar-2008 - What does () mean? [0011] 6 Cmts
    17-Feb-2008 - Decided - MOLD char! [0010] 6 Cmts
    14-Feb-2008 - Decided - Unicode Console [0009] 3 Cmts
    13-Feb-2008 - Defining whitespace [0008] 10 Cmts
    13-Feb-2008 - Decided: No direct support for Win98 [0007] 25 Cmts
    13-Feb-2008 - Collation table and sorting [0006] 7 Cmts
    13-Feb-2008 - Loading Strings [0005] 12 Cmts
    12-Feb-2008 - Complementing virtual bitsets [0004] 7 Cmts
    12-Feb-2008 - Decided -- WORD datatypes within set functions [0003] 9 Cmts
    12-Feb-2008 - Suggested topics? [0002] 3 Cmts
    12-Feb-2008 - The purpose of this blog. [0001] 1 Cmts
    Contents - Index of all articles.

    5-Mar-2008 - What does () mean? [0011]

    In R2, () was an error.

    In R3, () returns the value: UNSET!

    So:

    >> mold ()
    == "unset!"
    

    Whether that is the right result is debatable.

    We must admit, though, that () does represent a valid expression. The same situation arises in functions that allow missing arguments, such as:

    >> cd
    /C/rebol/3.0
    >> cd %..
    /C/rebol/
    

    which are valid anywhere if you use parens:

    >> (cd)
    /C/rebol/3.0
    >> (cd %..)
    /C/rebol/
    

    So, in the (cd) case, the missing argument for path is given the value UNSET!

    Mathematically, () is the trivial case. QED.
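
    A minimal sketch of the same idea, assuming R3 behaves as described above. The function greet and its argument name are hypothetical, chosen only for illustration. The literal argument is declared to accept unset! so that it may be omitted, much as the missing path is handled for cd:

```rebol
; An empty paren yields no value at all, so the result is unset
print unset? ()

; Hypothetical function: 'name is a literal argument that may be omitted
greet: func ['name [word! unset!]] [
    either value? 'name [
        print ["hello," name]
    ][
        print "hello, whoever you are"
    ]
]

(greet world)    ; argument supplied
(greet)          ; argument missing, so name is unset inside the function
```

    Wrapping each call in parens keeps a call without an argument from picking up whatever follows it, which is why the parenthesized cd form is valid anywhere.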

    Comments?

    6 Comments


    17-Feb-2008 - Decided - MOLD char! [0010]

    In R3 a char can be Unicode. For example, this is valid:

    c: #"^(1234)"
    

    The 1234 is the hex value of the character's Unicode code point.

    This is also valid:

    c: #"***"
    

    Where the *** is a multi-byte UTF-8 encoded character.

    However, when we mold such a character, which of the above formats is best? Do we want the hex escaped value or the actual character?

    Decision

    By default, the character will be in its native encoded format. In other words, a Greek alpha will appear as #"α".

    There will be an option added to force MOLD to output the ^(1234) escaped format when needed.
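
    To make the two notations concrete, here is a small sketch using the Greek alpha from the decision above (code point U+03B1, decimal 945). It assumes R3's Unicode-aware char! and string! datatypes, and that a string converts to binary as UTF-8. The words alpha and utf8 are hypothetical names:

```rebol
alpha: #"^(03B1)"                 ; hex-escaped form of Greek small alpha

print to integer! alpha           ; the code point as a decimal number
print alpha = to char! 945        ; both notations name the same character

; U+03B1 takes two bytes in UTF-8 (assumes R3 converts strings as UTF-8)
utf8: to binary! to string! alpha
print length? utf8
```

    The escaped form is unambiguous in any file encoding, while the native form is easier to read. That trade-off is what the MOLD option is meant to cover.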

    6 Comments


    14-Feb-2008 - Decided - Unicode Console [0009]

    On Windows, R3 currently uses the default console for text I/O. Eventually, we will add our own console, as in R2. For now, however, we want to spend our time on core functions rather than on that.

    The issue is: what mode do we want to use for the console, and how is it best initialized?

    IMO, there are three choices, in order of preference:

    1. Set it up for UTF-8 input/output. That is, keyboard input is sent to the R3 console device as bytes in UTF-8 encoded format.
    2. If that is not supported by MS, then UTF-16.
    3. If that is not supported by MS either, then raw Unicode codepoints as wide chars are OK.

    Decision

    We will use the wide-char console mode. This will make the console work properly for Unicode, but will require that we make special changes to support stdio redirection.

    3 Comments


    13-Feb-2008 - Defining whitespace [0008]

    It's funny the little things that need to be properly defined. For example...

    There are a few internal functions that know about whitespace. For example, the PARSE function by default will deal with whitespace.

    The Question:

    What do we mean by whitespace?

    Internally in R3, there are 2 definitions:

    • SPACE: #" " + #"^-"
    • WHITESPACE: SPACE + #"^/" + #"^m"

    However, for such functions, do we want to consider other control chars as whitespace? For example, is backspace whitespace?
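
    The two definitions can be written out as bitsets to see exactly which characters fall in each class. This is only an illustration, not the internal R3 implementation, and the words space-chars and whitespace-chars are hypothetical names:

```rebol
space-chars:      charset " ^-"        ; space and tab
whitespace-chars: charset " ^-^/^M"    ; SPACE plus newline and carriage return

print found? find whitespace-chars #"^/"        ; newline is in the set
print found? find whitespace-chars #"^(back)"   ; backspace is not in the set
print found? find space-chars #"^/"             ; newline is not plain SPACE
```

    Treating more control chars as whitespace would amount to adding them to the second charset, so the question is really how large that set should be.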

    In R2, many control chars were treated as whitespace, but I am not so sure we want to do that in R3.

    Your opinion?

    10 Comments

    Updated 4-Jul-2008 - Copyright REBOL Technologies - REBOL.net