Comments on: Accept all characters in direct lexical form?

Here is a simple question, but perhaps not a simple answer:

Do we allow non-printing characters within a character literal?

For example, we now accept chars like:

>> to integer! #"^J"
== 10

But what if the ^J is actually the raw character with code 10, that is, a literal line feed typed between the quotes?

Currently, R3 allows that. However, CureCode #1232 reports this feature as a bug.

Arguments can be made both ways, for and against.

If we decide not to allow it, then we should not allow any such literals below U+0020. (As for possible unprintable literals above that, I think we need to accept them all.)
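
To make the proposed rule concrete, here is a minimal sketch of the check a scanner could apply to a char literal. It is written in Python purely for illustration and is not REBOL's actual scanner; the function name `is_allowed_char_literal` is hypothetical. It only rejects raw code points below U+0020 between the quotes and does not validate the caret escapes themselves.

```python
def is_allowed_char_literal(source: str) -> bool:
    """Return True if a char literal such as '#"^J"' contains no raw
    control character (code point below U+0020) between its quotes.

    Assumption: only the rule discussed above is checked. Caret escape
    sequences are treated as ordinary printable text and not validated.
    """
    # A char literal needs at least #"x" (4 characters) and must be
    # wrapped in #" ... "
    if len(source) < 4 or not source.startswith('#"') or not source.endswith('"'):
        return False
    body = source[2:-1]
    # Reject any raw code point below U+0020; accept everything else,
    # including non-printing characters above that boundary.
    return all(ord(ch) >= 0x20 for ch in body)


# Escaped forms consist only of printable characters, so they pass.
print(is_allowed_char_literal('#"^J"'))     # caret followed by J
print(is_allowed_char_literal('#"^(0A)"'))  # hex escape
# A raw line feed (code 10) between the quotes is rejected.
print(is_allowed_char_literal('#"\n"'))
```

Under this sketch the escaped spellings remain valid, and only the raw control character is refused.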

Note that this does not affect how strings and files deal with such characters. We're only talking about the char literal format here.

7 Comments

Comments:

Brian Hawley
5-Oct-2009 23:47:44 Technically, bug#1232 reports that only newlines are a problem, because platform-specific line endings tend to mess up the literal value, particularly on Windows.
Aside from that, I'll stay neutral on this one.
Mark Ingram
6-Oct-2009 13:07:55 Until I see a believable use case, I'd say no. There's already enough confusion with the different methods of escaping chars in strings (probably both kinds) and files.
As an example of the kinds of problems you will have to watch out for, Rebol 2 has been out for years now and to my knowledge no-one has discovered that :mold screws file names that contain a literal '%':
>> do mold %"per%cent"
== %perÎnt
Maxim Olivier-Adlhoch
6-Oct-2009 23:09:58 Wouldn't this break low-level keyboard event handling?
I mean at the low-level view platform level, not the vanilla'd VID.
Maxim Olivier-Adlhoch
6-Oct-2009 23:12:11 failed to specify...
Wouldn't *preventing* these chars mangle the keyboard event handling?
-ignore this line... the blog thinks I'm an evil spam bot-
Brian Hawley
7-Oct-2009 2:03:13 Maxim, this restriction doesn't prevent char! values below space from existing, it just prevents them from being specified literally in REBOL char! syntax. The characters still exist, they would just have to be typed out in escaped form when you are writing them in REBOL syntax.
Mark, that behavior in REBOL can be a little surprising, true, and certainly needs to be mentioned in the manual page for file!. However, we are talking about requiring characters that you can't see be escaped using an existing escaping method that already works, not adding a new escaping method. Not quite the same thing.
Oldes
8-Oct-2009 16:05:41 I would remove it. Look how strange it is:
>> same? #"^/" #"^J"
== true

I think using this method is enough:
>> to-integer #"^(0A)"
== 10

But it's true that I quite often use notations like #"^/" or #"^-", so I don't know.
Ladislav
9-Oct-2009 4:39:25 Oldes, you are off by a mile, none of the representations you used is the one in question.
I would not feel harmed if non-printable characters were disallowed within a character literal, since I never dared to use them in such a situation.
As to whether non-printable characters should be disallowed, I am strictly neutral.
