REBOL 3.0

Comments on: Range Datatype?

Carl Sassenrath, CTO
REBOL Technologies
7-Feb-2007 20:50 GMT

Article #0058

Ok, let's open the discussion about a range datatype. There have been prior talks about it. Various users are in favor of it.

The purpose of this article is to draw a conclusion with regard to 3.0 implementation.

Definition

By range I mean a series interval, not a time interval.

The possible advantage of a range specification can be seen if we look at the common expression for series extraction:

copy/part at series start length

an example being:

str: "string"
str2: copy/part at str 2 4  ; result is "trin"

A range datatype would shorten this to:

str2: copy/part str 2..5  ; result is "trin"

This is also valid:

range: 2..5
str2: copy/part str range

The make form of the above literal range type is:

range: make range! [2 5]
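
Because range! does not exist in REBOL, the meaning of `copy/part str 2..5` can only be modeled. The following is a minimal sketch in Python, assuming 1-based inclusive bounds (which is what the "trin" result implies). `Range` and `copy_part` are hypothetical names for illustration, not REBOL functions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Range:
    """Hypothetical stand-in for the proposed range! value (1-based, inclusive)."""
    start: int
    end: int


def copy_part(series, rng):
    """Model of `copy/part series 2..5`: copy items start..end inclusive."""
    # REBOL positions are 1-based; Python slices are 0-based and end-exclusive.
    return series[rng.start - 1:rng.end]


rng = Range(2, 5)
print(copy_part("string", rng))  # trin
```

Under the same assumption, the range is an ordinary value that can be stored in a word and reused, as in the `range: 2..5` example.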

Of course, other series functions (remove, insert, etc.) could make use of the range type as well.

Powerful, dangerous?

So... it is possible for range to embed the series reference itself. This is where things get more powerful and interesting. If we allow:

str: "string"
substr: as-range str 2..5

Then, we could do this:

str: copy substr      ; "trin"
append "name" substr  ; "nametrin"

As you might expect, as-range above is short for:

make range! reduce [str 2 5]

Such an implementation draws us into discussion about the meaning of forms such as:

substr: as-range str 2..5
print substr/2  ; result is r

But that's the easy one. Even more thought provoking is the issue of the datatype identity of the range above, since any series datatype could be referenced by the range. Example:

if range? substr [print "yes of course"]
if string? substr [print "probably not true, it is range!"]
if string? type-of? substr [print "a possible approach"]

So, you can see the issue. We need to think this through carefully, including whether it should be allowed at all.
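
One way to picture the identity question is a view object that holds a reference to its series. The Python sketch below is a model under stated assumptions, not REBOL: `SeriesRange`, `as_range`, and `subject_type` are hypothetical names, and bounds are taken as 1-based and inclusive.

```python
class SeriesRange:
    """Hypothetical model of a range that embeds its series reference."""

    def __init__(self, subject, start, end):
        self.subject = subject  # a reference to the series, not a copy
        self.start = start      # 1-based, inclusive
        self.end = end

    def copy(self):
        # Model of `copy substr`
        return self.subject[self.start - 1:self.end]

    def pick(self, n):
        # Model of `substr/2`: the index is relative to the range start
        return self.subject[self.start - 1 + (n - 1)]

    def subject_type(self):
        # Model of the `type-of?` idea: ask the range what it refers to
        return type(self.subject)


def as_range(subject, start, end):
    return SeriesRange(subject, start, end)


substr = as_range("string", 2, 5)
print(substr.copy())                    # trin
print(substr.pick(2))                   # r
print(isinstance(substr, SeriesRange))  # True  ("yes of course")
print(isinstance(substr, str))          # False (it is a range)
print(substr.subject_type() is str)     # True  ("a possible approach")
```

In this model the range is its own type and the underlying series type is reachable only through an explicit query, which corresponds to the third test in the example.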

Maybe 3.N

I will also note that this datatype is deep and odd enough that, if implemented at all, it may be pushed to a 3.n rather than 3.0.

12 Comments

Comments:

tomc
7-Feb-2007 17:28:25
confirming you are thinking they are relative, not absolute

and negative and backwards ranges ...

range: -1..1 ;;; what happens when the range exceeds head/tail

range: 5..3

;;; does it reverse the sub-series

Maxim Olivier-Adlhoch
7-Feb-2007 18:23:01
Carl, rather than making a new datatype, it would be much better for series to have range notation.

in terms of code:

; instead of 
substr: as-range str 2..5
;which isn't much better than (exact same # bytes)
substr: copy/part str 2 5
;why not
substr: str/2:5

this really is only an expansion of current path notation.

why not add a third value for skip size:

str/2:5:2

omitting a value would indicate head or tail, but because of current path notation, this would mean changing the notation a bit:

; head to fifth
str/[:5]
; second to tail
str/[2:]

obviously, words and parens should also be part of the notation

; extract file name prefix
str/(find/last str "/"):(find/last str ".")
; direct word evaluation
str/start:end
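
For comparison, Python's slice syntax already has the start:end:step shape suggested in this comment, including omitted bounds. This is only an analogy: Python indexes from 0 and excludes the end, so the hypothetical helper `path_slice` below converts from the 1-based inclusive positions assumed in this discussion.

```python
def path_slice(series, start=None, end=None, step=1):
    """Hypothetical model of str/start:end:step with 1-based inclusive bounds.

    An omitted start means head; an omitted end means tail.
    """
    lo = 0 if start is None else start - 1
    hi = len(series) if end is None else end
    return series[lo:hi:step]


s = "string"
print(path_slice(s, 2, 5))     # trin   (str/2:5)
print(path_slice(s, 2, 5, 2))  # ti     (str/2:5:2)
print(path_slice(s, end=5))    # strin  (str/[:5])
print(path_slice(s, start=2))  # tring  (str/[2:])
```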

As I was reading your idea, I really was wondering when and where I would use the range type... and since it does not dramatically reduce the code (in fact increasing it in some cases), I'd probably forget about it (as in forgetting it's there, because I'd use it about twice a year) and continue using copy, since I use that every day.

a bit like the hash and list types... which really are not used that much, simply because they don't add more expressiveness to REBOL.

Brian Hawley
7-Feb-2007 19:08:29
Here is a proposal for the semantics of range!
  • A range would logically have four attributes: start, end, position and subject.
  • A range would be an immutable reference to a possibly mutable chunk of data, just the way other series references are now. Remember, NEXT A returns another reference to the same series that A references - it doesn't change A. You also can't change the subject of a range.
  • You could use a range anywhere you can use a regular series type. For that matter, a range should count as a series type: Depending on its subject type, it should count as an any-string! or an any-block! type.
  • The subject would be a reference to another series (like as-binary), not a copy. If not specified (or specified as none) then the subject would be the set of numbers between the start and end numbers, inclusive: a number line. If the subject is another range, it would just point to the other range's subject, with the start and end relative to the subject range's start and end.
  • It would be preferable for the start and end to be treated as references to positions on the subject, not offsets. Changes to the contents of the subject that change its length would change the offsets of start and end so that they would continue to point to the same values in their new positions. In other words, they would act like list! references, not block! references.
  • If the subject can be inserted into or changed then any changes to the range would be propagated to the subject. The range would be expanded or contracted accordingly, as explained above. You wouldn't be able to modify the number line when no subject is specified.
  • If the start is greater than the end, the returned values would go in reverse.
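
A few of the points above can be sketched in Python as a rough model. Everything here is an assumption made for illustration: `RangeRef` and `poke` are hypothetical names, positions are 1-based and inclusive, and the position attribute and the list!-style tracking of insertions are deliberately left out.

```python
class RangeRef:
    """Hypothetical model of a range with start, end and subject."""

    def __init__(self, start, end, subject=None):
        self.start = start
        self.end = end
        self.subject = subject  # reference to the underlying series, or None

    def _positions(self):
        # If start is greater than end, walk backwards.
        step = 1 if self.start <= self.end else -1
        return range(self.start, self.end + step, step)

    def values(self):
        if self.subject is None:
            # No subject: the range is the number line between the bounds.
            return list(self._positions())
        return [self.subject[p - 1] for p in self._positions()]

    def poke(self, n, value):
        # Writes go through to the subject; the number line is read-only.
        if self.subject is None:
            raise TypeError("cannot modify a range with no subject")
        self.subject[self._positions()[n - 1] - 1] = value


data = list("string")
r = RangeRef(2, 5, data)
print(r.values())                     # ['t', 'r', 'i', 'n']
print(RangeRef(5, 2, data).values())  # ['n', 'i', 'r', 't']
print(RangeRef(1, 4).values())        # [1, 2, 3, 4]
r.poke(1, "T")
print("".join(data))                  # sTring
```

The last two lines show a change made through the range appearing in the shared subject, which is the propagation behavior the proposal describes.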

Given these semantics, here's my wish list:

  • It would be cool if you could do a range on a port. It may have to be limited to direct ports, but I prefer it not be.
  • It would be absolutely cool if you could parse a range. Forget cool: Essential.
  • You could optimize some native functions to be faster on ranges. For instance, FOREACH and REPEAT on a range would be like an optimized FOR loop.

Does this make sense?

Anton Rolls
7-Feb-2007 20:09:46
Makes sense to me, Brian. Like it. But I was thinking maybe all series could have a Range attribute added alongside the index?
So,
    str: next range "string" 3
    index? str ;== 2
    range? str ;== 3
    probe str ;--> "tri"
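
The idea of a range attribute carried next to the index can be modeled as follows. This is a Python sketch, not REBOL: `Series`, `with_range`, and `visible` are hypothetical names, and the sketch assumes, as the example output suggests, that the range is a visible length that stays fixed when the index advances.

```python
class Series:
    """Hypothetical model of a series reference with an index and a range."""

    def __init__(self, data, index=1, length=None):
        self.data = data    # shared underlying data
        self.index = index  # 1-based, like index?
        self.length = len(data) if length is None else length

    def with_range(self, n):
        # Model of `range series n`: limit the visible length to n items.
        return Series(self.data, self.index, n)

    def next(self):
        # Model of `next`: advance the index, keep the range attribute.
        return Series(self.data, self.index + 1, self.length)

    def visible(self):
        start = self.index - 1
        return self.data[start:start + self.length]


s = Series("string").with_range(3).next()
print(s.index)      # 2
print(s.length)     # 3
print(s.visible())  # tri
```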

Jeff M
7-Feb-2007 22:23:20
Ranges are somewhat useful, but not so much so to be included in the language (IMO). However, intervals can be incredibly useful (think Smalltalk).

Intervals could be used as ranges where the step or increment value is defaulted to 1 or -1 depending on the range. Consider the following code samples:

i: make interval [
    start: 1
    end: 5
    by: 2
]

>> for each i [ print each ]
1
3
5
>> copy/part "Hello, world!" i
== "Hlo"
>> contains i 2
== false

Intervals have far more uses than just a simple range. Having infinite intervals would be quite slick. But, if that required lazy evaluation, that's a no-no for REBOL.
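
As an outside point of comparison, Python's built-in `range(start, stop, step)` behaves much like the interval described here, and `itertools.count` gives a lazy infinite one. The sketch below assumes 1-based positions for series access; `copy_interval` is a hypothetical helper and expects a finite interval.

```python
from itertools import count, islice

# range() excludes its stop value, so an inclusive 1..5 by 2 interval
# is written with stop = end + 1.
i = range(1, 5 + 1, 2)

for each in i:
    print(each)  # prints 1, 3 and 5 on separate lines


def copy_interval(series, interval):
    """Hypothetical model of copy/part with a finite interval of 1-based positions."""
    return "".join(series[p - 1] for p in interval if p <= len(series))


print(copy_interval("Hello, world!", i))  # Hlo
print(2 in i)                             # False

# A lazy, infinite interval: values are produced only when requested.
odds = count(1, 2)
print(list(islice(odds, 4)))              # [1, 3, 5, 7]
```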

Goldevil
8-Feb-2007 2:34:30
Just a remark about notation. With 'parse (and more precisely 'charset) we are already using a range notation, but with a different syntax:

alphanum: charset [ #"A" - #"Z" #"a" - #"z"]

Maybe Rebol 3 can introduce a unique notation for both.

alphanum: charset [ #"A"..#"Z" #"a"..#"z"]
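
What a character range expands to can be shown with a small Python model. The function name `charset` is reused here only for readability; this is a hypothetical sketch of inclusive character ranges, not the REBOL native.

```python
def charset(*ranges):
    """Hypothetical model of charset with inclusive (first, last) character ranges."""
    chars = set()
    for first, last in ranges:
        # Include every character from first through last.
        chars.update(chr(c) for c in range(ord(first), ord(last) + 1))
    return chars


alphanum = charset(("A", "Z"), ("a", "z"))
print(len(alphanum))    # 52
print("q" in alphanum)  # True
print("5" in alphanum)  # False
```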

Paul T.
8-Feb-2007 9:02:33
I'm all in favor of a range datatype. I thought for a long time that it was a powerful feature that could be added to REBOL, especially considering the other powerful series handling functions. It just seemed out of place to not have a range datatype to complement the rest.

Maxim Olivier-Adlhoch
8-Feb-2007 23:55:46
I don't see why we need a datatype to express a range... I mean all series have a range inbuilt... why not just expose them as attributes of those series.

somehow, to me, we are trying to complicate something that's inherently simple and already built in. series already have a soft end and soft head. why not allow us to also play around with the soft end, just as we do with the soft start (skip, next, at, etc).

I just don't see why going through an actual datatype allocating something, binding it and all would help things like the parser.

Series can be preallocated to be larger than what we can edit... using the library forces us to preallocate strings, otherwise they stay at 0 length even if they are filled by the external code.... why not allow us to edit this end of a series directly? This removes the point of a range type, especially since, as Brian points out, when you point to another series you already share the same memory area. So series already act as range types, with the limit that we can't currently edit the end.

extending (completing) series is better than adding complexity by layering something over to duplicate what's almost already there.

no?

Robert
10-Feb-2007 7:49:56
If we add it please don't narrow it down to series in the Rebol sense. A range can be seen as an interval that can be used for calculations. Working on the series of numbers. And it should support decimal! as well.

Hence I don't know how to write it (in the EU it's possible)

1,23..2,56 would be OK but looks ugly.

1.23..2.56 can be parsed but is even more ugly.

As a range can be defined with a block notation. How about:

my-range: [1 .. 2]

And then I can do: my-range/1 my-range/2

probe my-range * 3  ; [3 .. 6]
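
The arithmetic on a numeric range can be sketched in Python. `Interval` is a hypothetical type with inclusive bounds; how REBOL would spell or evaluate such a value is not settled by this comment.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Interval:
    """Hypothetical numeric range with inclusive bounds."""
    low: float
    high: float

    def __mul__(self, k):
        # Scaling by a negative factor swaps the bounds.
        a, b = self.low * k, self.high * k
        return Interval(min(a, b), max(a, b))

    def __contains__(self, x):
        return self.low <= x <= self.high


my_range = Interval(1, 2)
print(my_range * 3)                 # Interval(low=3, high=6)
print(1.5 in Interval(1.23, 2.56))  # True
```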

Gregg Irwin
15-Feb-2007 16:59:31
To save some typing...

http://www.rebol.org/cgi-bin/cgiwrap/rebol/ml-display-message.r?m=rmlGGHC

The value of a range that can only refer to a subset of a series could be cool, and probably useful, but I've never wanted or needed it that I can recall; not badly enough to write something that does it anyway.

I think the iterator/generator approach is a powerful tool and, while mine isn't lazy (I focused more on the dialect aspect, and keeping it simple), that's the direction I'd prefer.

felix
4-Mar-2007 8:53:59
is it needed in the language as a construct?
i don't think so. the only thing that you need as a primitive is a sequence generator function, and the range can be coded as a simple function:

ex (pseudo code, assume right to left eval):

 til 5 -> 0 1 2 3 4
 2+til 5 -> 2 3 4 5 6
 2*2+til 5 -> 4 6 8 10 12 

this is what others do... :) keep it simple.
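
The pseudo code can be reproduced in Python as a small sketch. `til` is written here as an ordinary function; because Python does not evaluate right to left, `2*2+til 5` is spelled out as `2 * (2 + x)` for each element.

```python
def til(n):
    """Model of `til n`: the integers 0 .. n-1 as a list."""
    return list(range(n))


print(til(5))                         # [0, 1, 2, 3, 4]
print([2 + x for x in til(5)])        # [2, 3, 4, 5, 6]
print([2 * (2 + x) for x in til(5)])  # [4, 6, 8, 10, 12]
```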

Updated 18-Apr-2024 - Copyright REBOL Technologies - REBOL.net