REBOL 3.0

Bitset improvements and Unicode support

Carl Sassenrath, CTO
REBOL Technologies
17-Jan-2008 20:57 GMT

Article #0111

In R3 a number of changes have just been made to the BITSET! datatype (to appear in the next alpha release):

  • Bitsets directly support Unicode character values (codepoints).
  • Bitsets are variable length and auto-expand as necessary.
  • New functions (datatype actions) and arguments are supported. For example, you can now AND, OR, and XOR bitsets.
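
As a rough sketch of the set operations mentioned in the last item (the words digits, letters, and the derived names are illustrative only, and the exact behavior in the alpha release may differ), combining bitsets could look like this:

```rebol
; Sketch only: assumes the AND, OR, and XOR support described above.
digits:  make bitset! "0123456789"
letters: make bitset! "abcdef"

hex-chars:  digits or letters    ; characters in either set
common:     digits and letters   ; characters in both sets (none in this case)
either-one: digits xor letters   ; characters in exactly one of the two sets

if find hex-chars #"c" [print "c is a hex character"]
```

Note that the two source bitsets cover different character ranges, so this also relies on bitsets being variable length.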

These changes are a result of adding Unicode support throughout R3. We want code like this to work for any string, including Unicode strings:

white-space: make bitset! " ^-^/"
if find white-space a-char [...]

and:

where: find a-string white-space
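
To illustrate the same pattern with a character outside the ASCII range, here is a minimal sketch. The no-break space (U+00A0) and the sample string are assumptions added for illustration; they are not part of the original example.

```rebol
; Sketch only: ^(A0) is the no-break space (U+00A0).
white-space: make bitset! " ^-^/^(A0)"
a-string: "hello^(A0)world"

; FIND returns the string positioned at the first matching character,
; or NONE if no character of the string is in the bitset.
where: find a-string white-space
if where [print index? where]
```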

In addition, bitsets that work with Unicode should be efficient and use very little memory, even though Unicode spans a large range of possible characters. In the example above, the bitset requires only 10 bytes.

We also want bitsets to expand when needed. In R2, bitsets were fairly restrictive and errors would be thrown for actions that should, in theory, be valid.
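
The following sketch shows what auto-expansion might look like in practice. It assumes that APPEND is among the actions supported on bitsets and that testing a character beyond the stored range returns NONE rather than an error; both are assumptions to verify against the alpha release.

```rebol
; Sketch only: APPEND on a bitset is an assumption, not confirmed above.
vowels: make bitset! "aeiou"

; Adding a character beyond the current range should grow the bitset
; instead of raising an error. ^(E9) is U+00E9 (e with acute accent).
append vowels #"^(E9)"

if find vowels #"^(E9)" [print "accented vowel found"]

; Testing a character far outside the stored range should simply fail
; to match. ^(3000) is U+3000, the ideographic space.
if not find vowels #"^(3000)" [print "not a vowel"]
```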

So, these improvements have been made, and we may make a few more once users get a chance to try them out.

See the R3 Documentation Home Page and click on the bitset link for more information.

Updated 4-Oct-2024 - Copyright REBOL Technologies