Comments on: PARSE: do we need UNTIL?

Carl Sassenrath, CTO
REBOL Technologies
13-Oct-2009 23:09 GMT

Article #0271
Main page || Index || Prior Article [0270] || Next Article [0272] || 9 Comments || Send feedback

Ladislav has suggested UNTIL:

parse str [until a b]

This means: while rule a fails do rule b.

This is an iterative version of the common recursive pattern:

c: [a | b c]

which is used all over the place in language definition rules.

At first, I would ask is UNTIL really necessary?

Second, I would ask how difficult is it to write an iterative rule for it using the features already available?

And third, what's the precise definition of UNTIL (in terms of rules allowed with it.) For example, are these valid?

until not a b
until 3 a b
until set h 1 3 a b
until a 3 b
until a copy h b

Fourth, what happens if rule b fails? Does the UNTIL fail?

Unless these questions can be answered quite soon, and the definition be something easily implemented, it is not likely that UNTIL will be part of 3.0.



Brian Hawley
14-Oct-2009 1:08:46
Ladislav, you better speak up here.

One reasonable iterative equivalent of Ladislav's recursive until definition is this:

any [not a b] a
Except if the a has side effects, they shouldn't be repeated, and they will with that version. If you can split the a rule into matching and modifying portions, the modifiers can be put in the second instance.

Note that in this version, the until will only fail when the b fails and is not followed by the a. If the end is reached before the a is, the whole thing fails.

Carl Sassenrath
14-Oct-2009 1:31:14
Given the new behavior of ANY and SOME to loop forever (but see blog #272, this will change and there will be a new keyword for it), isn't UNTIL simply:

any [a break | b]

That assumes we don't care about the result of b. It's easy to modify the line if we do require b to succeed:

any [a break | b | break]
14-Oct-2009 6:53:24
My understanding is, that there were many requests from Parse newbies that could be reduced to until. (the last one I saw was from Giuseppe)

I have a suspicion, that they actually don't recognize it would do what they are asking for, so there are chances they will not discuss this topic.

As far as the to and thru multiple is discussed, the

a: [thru b]

rule is equivalent to

a: [until b skip]

rule, but in the until rule Parse can match any b pattern compared to just special cases in the thru rule.

Similarly, the

a: [to b]

rule is equivalent to

a: [until [and b] skip]

Another idea was to suppress the usage of break, since it does not improve the readability of code, especially compared to until.

Yes, Brian, your rule is equivalent, except for some special cases. The equivalent is a bit more complicated:

[any [not [a c: (d: [:c])] [b | (d: [fail]) fail]] d]

Carl, both your rules always succeed (if they finish, that is), since any always succeeds.

14-Oct-2009 7:46:45
Are these valid?

until not a b

If not, then it would not be a problem to write:

until [not a] b

The same can be said about all the other variants, I guess.

14-Oct-2009 7:49:24
"What happens if the rule b fails?" - similarly, as in the rule:

rule: [a | b rule]

if the rule b fails, the whole rule fails.

Brian Hawley
14-Oct-2009 14:14:50
Carl, the difference between my version and yours is that mine won't succeed if there is no a, so the end condition becomes a or end. I don't like that, I want a to be required, else for the until to fail.

There are some qualities of until as I would like it that would be trickier to implement in the workarounds:

  • The end condition should be required, and the rule should fail if it is missing (missed by Carl's equivalent).
  • If the end condition has side effects, then its full effects should only be executed once (missed by my equivalent, but not Carl's).
  • The inner loop should be the endless loop variant, in case the a or b rules are semantic instead of syntactic (missed by everyone as of blog #272).
  • Not be recursive, or else you'll blow the stack while XML parsing (missed by Ladislav's recursive version).
  • Be easy to use (missed by Ladislav's iterative equivalent).

Now, I've had to write code that ended up as complex as Ladislav's iterative equivalent when parsing HTML, and my main use of break in the past was to implement Carl's iterative equivalent. Never did the recursive version because of stack limits. I would use until a lot if it were implemented according to the criteria above.

Note, Ladislav: "if the rule b fails, the whole rule fails." Only if rule a (the end condition) doesn't succeed first. The inner rule b is treated like any b - the endless loop variant since we will only have one until operation.

14-Oct-2009 14:34:28
My judgement is very easy - everytime when I can see claims like: "you don't need that, it can be done by: "

[any [not [a c: (d: [:c])] [b | (d: [fail]) fail]] d]

... I ask myself - is there really a need to ask, if 'untill should be added? :-)

Carl Sassenrath
14-Oct-2009 19:05:08
Brian, in reply to your note above... I believe the form I posted allows semantic-only rules. But, not tested.

Also, now with A90, I think the correct iterative UNTIL as you defined above can be written as:

while [a break | b | reject]

Let me know if you agree. Not also that b can be semantic only.

Also, regarding blowing the stack in XML when recursing... any language definition that requires thousands of recursions sounds quite wrong (improperly defined), doesn't it?

Recursions in grammars almost always happen in tree branches... and I don't know of any language that would require more than a 1000, let alone the 42000 that PARSE now allows by default. But, I suppose I could be wrong. Some languages might be quite messy.

15-Oct-2009 8:04:32
Yes, the

while [a break | b | reject]

is equivalent to the original definition. I am quite curious, why the proposed until keyword is hard to implement, but the above is an adequate replacement.

If I had to use the recursive

rule: [a | b rule]

to search for something in a text, advancing the input by one character at average, then the rule would not be adequate for even a moderately sized text, while the iterative rule is OK.

Post a Comment:

You can post a comment here. Keep it on-topic.


Blog id:



 Note: HTML tags allowed for: b i u li ol ul font span div a p br pre tt blockquote

This is a technical blog related to the above topic. We reserve the right to remove comments that are off-topic, irrelevant links, advertisements, spams, personal attacks, politics, religion, etc.

Updated 25-Jun-2024 - Edit - Copyright REBOL Technologies -