Comments on: More about the VECTOR datatype

The main purpose of the vector datatype is to provide more efficient integer and decimal array storage.

For example, if you need to index a large file and keep track of 100000 index positions, you can do so with a block:

idxs: make block! 100000

or, you can use:

idxs: make vector! 100000

The benefit of the vector is the memory it saves. Of course, the downside is that all its elements must be of the same type and size (homogeneous, e.g. 32 bit integers). In the example above, the vector array would save 1'200'000 bytes of memory. So, that is a win in many such cases.
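To see where that number comes from, here is a rough sketch of the arithmetic. It assumes each block slot occupies 16 bytes and each 32 bit vector element occupies 4 bytes. The 16 byte figure is an assumption inferred from the savings quoted above, not something stated in this note, and any per-series overhead is ignored.

```rebol
; Assumed sizes: 16 bytes per block slot (inferred), 4 bytes per 32 bit element
count: 100000
block-bytes: count * 16            ; 1600000
vector-bytes: count * 4            ; 400000
print block-bytes - vector-bytes   ; prints 1200000
```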

The above make creates default elements as 32 bit signed integers. However, you can specify the bit-size of the vector element. For example, if you need to store a sound that consists of 100000 samples at 16 bits-per-sample:

samp: make vector! [integer! 16 100000]

If you needed those to be unsigned integers:

samp: make vector! [unsigned integer! 16 100000]

Note that only 8, 16, 32, and 64 bit sizes are supported.

Instead of integers, if you want to store decimal (floating point) numbers:

decs: make vector! [decimal! 32 100000]

or:

decs: make vector! [decimal! 64 100000]

Note that decimals must be 32 or 64 bits and are always signed numbers.
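As a rough guide to what each element size costs, the sketch below prints the raw data size of a 100000 element vector at each supported width. It counts element data only (bits / 8 bytes per element) and ignores any per-vector overhead, which is not specified here.

```rebol
; Raw element storage for 100000 elements at each supported bit size
foreach bits [8 16 32 64] [
    print [bits "bit elements:" 100000 * bits / 8 "bytes"]
]
```

By this estimate the 16 bit sound sample above needs 200000 bytes of element data, and a 64 bit decimal vector of the same length needs 800000.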

Element Access

Currently, the only access to vector elements is via absolute indices. The series index methods (first, next, etc.) are not supported at this time.

To get a value:

v: pick vect 100

or,

v: vect/100

To set a value:

poke vect 100 10

vect/100: 10

The integer value for elements makes vectors more appropriate for mathematical functions such as those that deal with sound samples. For example, to generate a 16 bit sine wave sample:

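; 3600 samples, one per tenth of a degree (sine takes degrees)
wave: make vector! [integer! 16 3600]
repeat n 3600 [
    poke wave n 32767 * sine n / 10  ; sine of (n / 10), scaled to the 16 bit range
]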

We can support other native actions, as we find them to be useful. For example, we may want to provide a high-speed method of setting a sequence of values.

Initializers and Conversions

You can initialize a vector from a block using code such as:

vect: make vector! [integer! 32 [1 20 300 40 -50 ...]]

In addition, a direct conversion method will be supported:

vect: to vector! [1 20 300 40 -50 ...]

In such cases the type and size will be inferred from the data itself to create the most compact vector representation.
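Since the details are not yet defined, the following is only a sketch of one way such an inference could work for a block of integers. The helper name smallest-bits is hypothetical, it assumes signed elements only, and it assumes 64 bit integers as in REBOL 3. It is not the actual conversion rule.

```rebol
; Hypothetical helper: pick the smallest signed bit size that holds every value
smallest-bits: func [data [block!] /local lo hi] [
    lo: hi: first data
    foreach n data [
        lo: min lo n
        hi: max hi n
    ]
    case [
        all [lo >= -128 hi <= 127] [8]
        all [lo >= -32768 hi <= 32767] [16]
        all [lo >= -2147483648 hi <= 2147483647] [32]
        true [64]
    ]
]

print smallest-bits [1 20 300 40 -50]   ; prints 16, because 300 does not fit in 8 bits
```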

Conversions from vectors to blocks and binaries will also be supported.

blk: to block! vect
bin: to binary! vect

Details to be defined.

Dimensions

It is planned for vectors to support multiple dimensions. To define a two-dimensional vector of 100 vectors of 365 integers:

samp: make vector! [integer! 32 100 365]

print samp/30/10
samp/50/200: 100

This is not yet supported, so let me know if you need it.
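Until then, a two-dimensional layout can be emulated on top of a one-dimensional vector by computing the index yourself. The sketch below assumes row-major order, and the helper name flat-index is hypothetical. It uses only make, pick, and poke as described earlier.

```rebol
; Hypothetical workaround: flat-index is an illustrative helper, not a native
cols: 365

flat-index: func [row [integer!] col [integer!]] [
    (row - 1) * cols + col          ; 1-based, row-major position
]

samp: make vector! [integer! 32 36500]   ; 100 rows of 365 integers

poke samp flat-index 50 200 100          ; stands in for samp/50/200: 100
print pick samp flat-index 50 200        ; stands in for print samp/50/200
```

Here flat-index 50 200 evaluates to 18085, that is 49 * 365 + 200.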

Bounds and Range Checks

Currently, index bounds are checked, but element ranges are not. For example, no error occurs if a 16 bit element is set to a value that needs 32 bits; the result is truncated. Element range checks could be added at the cost of additional performance overhead and code size.
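To illustrate what truncation can do to a value, the sketch below emulates it in plain REBOL. It assumes truncation means keeping the low 16 bits and reading them as a signed two's complement number; the text above says only that the result is truncated, so treat that as an assumption. The helper name wrap16 is hypothetical.

```rebol
; Hypothetical helper: emulate storing an integer in a signed 16 bit element
wrap16: func [n [integer!] /local low] [
    low: n // 65536                         ; keep the low 16 bits
    if low < 0 [low: low + 65536]           ; remainder follows the sign of n
    if low >= 32768 [low: low - 65536]      ; reinterpret as signed
    low
]

print wrap16 32767    ; 32767, fits unchanged
print wrap16 40000    ; -25536
print wrap16 70000    ; 4464
```

Under that assumption an out-of-range value silently wraps instead of raising an error, which is why range checks may be worth their cost in some applications.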

Vector Operators

Vector operators like add, multiply, inverse, determinant, etc., are not planned for the first release of REBOL 3. They can be added later, if so desired.

Other Datatypes

In the future, it may be possible to allow other datatypes to be vectorized.

11 Comments

John Niclasen
4-Apr-2007 3:41:56 It looks good, Carl! Reading this makes me think of vector calculations on supercomputers. That technology has become available to everyone these days with the Playstation3. At North Carolina State University, they've already built a cluster of 8 PS3 with 64 vector units in all: http://www.csc.ncsu.edu/news/news_item.php?id=464
Now, with the combination of rebcode, the vector datatype and OS-dependent parts of REBOL becoming open source, I see some interesting possibilities. Is the vector datatype implemented in REBOL in a way, so it's possible to utilize it (from rebcode), if and when we have REBOL on the PS3?
Carl Sassenrath
4-Apr-2007 11:57:34 Yes, and it will be quite interesting to try it.
Dave Cope
4-Apr-2007 13:58:52 I could really work with vectors. Could I cast my vote for vector operators as and when please. When you talk about other datatypes, is it realistic to ask for pair types? Your previous post mentioned about the possibility of the pair! type being non integer. This, coupled with vectors, would allow me to do geographic transformations in REBOL. (I could still work with pair! as it stands via scaled integers anyway).
Cheers
Brian Hawley
4-Apr-2007 16:55:49 It would be interesting to have some data-parallel operations to go with vector support in REBOL and rebcode. Then you could really make a PS3 fly, or anything with a vector unit like AltiVec or SSE.
Steeve
4-Apr-2007 17:59:41 Anyway, it would be interesting to have data-parallel operations to go with any type of series. Why limit this feature to vectors only? Vectors in REBOL are just a way to save space, if I understand correctly.
Maxim Olivier-Adlhoch
5-Apr-2007 2:22:31 vectors :-) yes!!!! now we can start talking about HUGE lists using less memory, without resorting to costly series conversions to/from binary.
Q1: I am curious about a little word I noticed... 'UNSIGNED ... will that option be part of all integer handling in R3 or only within vectors?
Q2: is it reasonable to think that vectors will be addressable directly within struct types? so we can exchange large data sets directly within external libs or linked code? thinking about 3d models and UV maps for example, which are just (large) arrays of points in space. having them as vectors means we could address them directly AFAICT!
Brian Hawley
5-Apr-2007 20:23:51 Steeve, vectors are also a way to store data in formats that native code would understand better, particularly code that can be run on data-parallel hardware. Those kinds of operations wouldn't work with block types, but would work just fine with strings, binaries and vectors.
I agree though that there should also be coarser-grained parallel operations in REBOL to operate on all series, but they should look like loops. The only semantic difference would be that the different iterations of the loop would logically operate at the same time, or at least in an undefined order.
Brian Hawley
5-Apr-2007 20:32:17 Maxim,
Q1: The reach of UNSIGNED in REBOL might be more limited than you think. Unsigned is a kind of static constraint on a value, and REBOL's typeless variables don't have that kind of constraint. It seems like a good idea for struct! fields though, and we'll see what kind of constraints get applied to object! fields with the new objects.
Q2: I agree, structs of vectors sound cool! So would vectors of structs. I wonder how these more complicated specifications would be stored and linked...
Maxim Olivier-Adlhoch
5-Apr-2007 23:51:18 brian, wrt Q2, I feel that this would be the best way to fill space within the struct. so it's an allocated array within the struct as opposed to a pointer to such a space.
the reason being that since we can define the vector to being of different types, then we don't have to convert them when accessing the memory space.
is that what you understood by my question?
one issue I can see is whether a large vector is sure to live in one single contiguous garbage heap. if it's spread out internally in different blocks of ram, then it's not really the same as an actual ARRAY and cannot be used for structs I guess.
there is also always the question of byte ordering for multibyte types and if REBOL uses the same as the code filling the struct. ':-/
James Nakakihara
9-Apr-2007 18:04:16 So Carl, when can we expect Rebol on a PS3? Me thinks it will be my future Amiga.
Henrik
14-Apr-2007 14:42:40 I was starting to think if vector! is useful for unicode text?