Defining binary datatype insert actions

Carl Sassenrath, CTO
REBOL Technologies
4-Mar-2008 23:58 GMT

Article #0118

First, keep in mind that bytes are not characters in R3. Characters may hold Unicode values, which may require more than one byte, and the form of those bytes depends on the Unicode encoding used.
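To make the byte/character distinction concrete, here is a small sketch. It is written in Python rather than REBOL, since R3's own behavior is what is being decided. The helper name `byte_forms` is made up for this illustration. It shows how one character becomes a different number of bytes under each encoding.

```python
def byte_forms(ch: str) -> dict:
    """Return the bytes produced for one character under several encodings."""
    encodings = ("utf-8", "utf-16-le", "utf-16-be")
    return {name: ch.encode(name) for name in encodings}

# "A" (U+0041), e-acute (U+00E9), and the euro sign (U+20AC)
for ch in ("A", "\u00e9", "\u20ac"):
    forms = byte_forms(ch)
    summary = ", ".join(f"{name}={data.hex()}" for name, data in forms.items())
    print(f"U+{ord(ch):04X}: {summary}")
```

The same character takes one, two, or three bytes in UTF-8, and its two UTF-16 bytes swap order between the LE and BE forms.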

Now, what does it mean to INSERT or APPEND to a binary value? For example:

bin: make binary! 100
append bin 123
append bin 12:30
append bin "test"
append bin

In R2, BINARY! was defined as raw data that, for actions such as those above, defaulted to text (meaning that CR LF line terminators were kept as-is, not converted to a single LF as is REBOL's convention). Because of that, we defined inserting an integer to mean: FORM the integer first, then insert the resulting characters into the binary.

In R3, BINARY! is defined as an encoded set of bytes. The encoding depends on what the binary holds. For example, text can be encoded in many ways, such as UTF8, UTF16(LE), UTF16(BE), and even Latin1 or other code page encodings. If we insert an integer, 123, what does it mean? Do we want to FORM it and insert it as UTF8 as the default? Or do we just insert the byte whose value is 123?
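The two readings of inserting 123 give different bytes. The sketch below uses Python as a neutral stand-in, not REBOL. The function names `append_formed` and `append_raw_byte` are hypothetical, and Python's `str()` only approximates FORM for integers.

```python
def append_formed(buf: bytearray, value: int, encoding: str = "utf-8") -> bytearray:
    """R2-style reading: turn the integer into text, then append its encoded characters."""
    buf.extend(str(value).encode(encoding))  # str() stands in for FORM (assumption)
    return buf

def append_raw_byte(buf: bytearray, value: int) -> bytearray:
    """Alternative reading: append the single byte whose value is the integer."""
    if not 0 <= value <= 255:
        raise ValueError("value does not fit in one byte")
    buf.append(value)
    return buf

print(append_formed(bytearray(), 123).hex())    # 313233 (the characters "1" "2" "3")
print(append_raw_byte(bytearray(), 123).hex())  # 7b (one byte)
```

Note that the raw-byte reading also needs a rule for integers above 255. This sketch simply rejects them, which is only one of several possible choices.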

So, right now in R3, the action:

append bin "text"

is ambiguous. Does it mean append one byte per character of "text", does it mean encode the string as UTF8, or does it mean something else?

We need to define it. We can pick any meaning we want, but it should be clearly stated. It can even be an error.
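As a rough way to compare the candidates, the sketch below appends the same string under several encodings and treats a missing encoding as an error. It is Python, not REBOL. The function `append_text` and its `encoding` parameter are invented for this illustration and are not a proposed R3 API.

```python
def append_text(buf: bytearray, text: str, encoding=None) -> bytearray:
    """Append text to a byte buffer; refuse to guess when no encoding is given."""
    if encoding is None:
        # The "it can even be an error" option: make the caller state the meaning.
        raise TypeError("appending a string to binary requires an explicit encoding")
    buf.extend(text.encode(encoding))
    return buf

for enc in ("latin-1", "utf-8", "utf-16-le", "utf-16-be"):
    print(enc, append_text(bytearray(), "text", enc).hex())
```

For plain ASCII such as "text", the one-byte-per-character (Latin1) and UTF8 readings produce the same four bytes, so the choice only becomes visible with characters above 127. With Latin1, characters above 255 cannot be encoded at all and raise an error in this sketch.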

So, think about it and post your ideas.


Updated 22-Jun-2024 - Copyright REBOL Technologies