Important decision: is binary a string?

Carl Sassenrath, CTO
REBOL Technologies
20-Jun-2009 19:51 GMT

Article #0209

It's time to make an important decision: Is a binary sequence of bytes a string?

In R2 it was. But, in R3, we need to discuss it and reach a decision together.


As you know, R3 adds Unicode. The string! datatype is defined as a sequence of Unicode code points (just think "chars" for this discussion). Its meaning is "text".

A binary! datatype is a sequence of bytes. It may or may not be text. It could be an image, a sound, machine code, or whatever. So, binary is quite often an encoded datatype. It must be decoded to be useful.
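To make the difference between characters and bytes concrete, here is a small console sketch. It assumes an R3 build in which conversion between string! and binary! goes through UTF-8; the sample values are chosen only for illustration.

```rebol
>> length? s: "^(E9)"          ; one character, U+00E9
== 1
>> to-binary s                 ; encoded as UTF-8: two bytes
== #{C3A9}
>> length? to-binary s
== 2
>> to-string #{48656C6C6F}     ; decoding bytes back into text
== "Hello"
```

The string counts code points and the binary counts bytes, so the two lengths differ as soon as a character falls outside ASCII.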

REBOL defines any-string! to include all datatypes that act like strings. But, how do we define string?


R3 requires a more precise definition of "string". Specifically, we need to decide whether the any-string! typeset includes binary!, a choice that would affect a number of functions.

Update: The Decision

The decision has been made and implemented in the newer releases of R3: binary is no longer part of the string "superclass".
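A quick way to see the effect of this decision is to test values against the typeset. This is a sketch assuming a recent R3 console; in R2 the second test would be expected to return true instead.

```rebol
>> any-string? "text"
== true
>> any-string? #{0504}
== false
>> series? #{0504}        ; binary! is still a series
== true
```

So a binary remains a series that can be indexed, appended to, and searched, but functions that accept any-string! no longer treat it as text.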

In R3, binary plays a different role, and we think this is an important distinction. It also makes many of the functions that deal with binary cleaner.

For example, the relationship between binary and integers becomes really obvious:

>> append b: #{0504} 3
== #{050403}
>> pick b 2
== 4

This was not true in R2: because binaries were often used for character strings, the append above had to insert #{33}, the ASCII code (hex 33) for the character "3". In R3, binary is binary: just a sequence of bytes, with no encoding assumed.
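For comparison, the same append in an R2 console would be expected to look like the sketch below. It is based on the behavior described above rather than on a particular R2 release.

```rebol
; R2: the integer 3 is first formed into the text "3",
; and the ASCII byte for that character is appended
>> append b: #{0504} 3
== #{050433}
```

The last byte is hex 33 (the character "3") rather than the integer value 3, which is the ambiguity that the R3 decision removes.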

Read the comments for the full discussion.


Updated 15-Jun-2024 - Copyright REBOL Technologies