Comments on: PARSE: allow switching input series?

Carl Sassenrath, CTO
REBOL Technologies
9-Oct-2009 4:50 GMT

Article #0265

Allowing the LIMIT concept in parse is quite difficult. The problem is that it affects too many sub-functions, like those that find subseries. It won't be in 3.0.

However, as I thought about it more, you can do LIMIT yourself by chopping up the input series in advance.
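
As a minimal sketch of that idea (the sample data, the split point of 6, and the word names are illustrative assumptions, not from the original post), the input can be cut into pieces first and each piece parsed on its own, so that no rule can run past the end of its piece:

```rebol
data: "headerbody"

; Chop the input up front. COPY/PART makes a new 6-character string,
; so rules applied to it cannot read past that "limit".
head-part: copy/part data 6     ; "header"

; SKIP returns the same series at a later index (no copy is made).
body-part: skip data 6          ; "body"

print parse head-part ["header"]
print parse body-part ["body"]
```

Both PARSE calls are expected to succeed, since each rule consumes its piece to the end. The cost is the extra copy and the separate calls, which is what a built-in LIMIT would have avoided.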

You could even handle all the pieces in a single parse... except that attempting to switch the input series currently raises an error:

>> b: "this"
>> parse "test" ["test" :b "this"]
** Script error: PARSE - attempt to change input series: :b

That restriction comes from the fact that we often need to backtrack the input, and if you change the input series, backtracking becomes problematic.
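
To make the backtracking concrete, here is a small sketch using an ordinary single-series rule (the input and rule are made up for illustration):

```rebol
; The first alternative matches "a", then fails on "x".
; PARSE restores the input position to where the alternative began
; and tries the second alternative from there.
print parse "ab" ["a" "x" | "a" "b"]
```

The second alternative is expected to succeed because the input was rewound first. A saved position only has meaning relative to the series it was recorded in, which is why swapping the series partway through an alternative complicates the recovery.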

However, REBOLers are smart coders... so perhaps we should remove this single-series restriction and just let them do what they want.

What say you? Allow switching horses in midstream?



9-Oct-2009 4:25:43
I vote for allowing switching horses in midstream; it is an advantage in some situations.
9-Oct-2009 4:34:41
Without thinking about possible problems, I'd say allow changing of horses. (For the problems someone more proficient with the internals has to have the final word.)
9-Oct-2009 7:28:27
Allow it, allow it !!!

Brian Hawley
9-Oct-2009 9:56:12
This is something I've wanted for years to implement incremental parsing, however...
"That restriction comes from the fact that we often need to backtrack the input, and if you change the input series, backtracking becomes problematic."
Exactly how problematic? So problematic that there are no sensible workarounds that can be documented in the advanced parse docs? If there are no decent workarounds for the problems, I say no. Otherwise, go for it :)
Brian Hawley
9-Oct-2009 9:59:24
Sorry to hear about limit though. Avoiding having to chop up the series or multipass parsing was why I was in favor of limit when you proposed it in the first place.
Carl Sassenrath
9-Oct-2009 14:49:46
The problem is this: if you change the series but the rule fails, forcing a recovery to a prior index, it's still the new series. That is, we do not recover to the old series.

If advanced users are willing to live with that restriction, then this change can be made.

BTW, you can change the type of series being parsed:

b: [a b c]
parse "abc" ["abc" :b ['a 'b 'c]]

This is not the same as INTO (with the new mode). INTO starts a new PARSE (with new state variables). A get-word simply changes the input series.
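
The difference can be sketched with features that already work on a single series (the data below is illustrative, not from the post). INTO descends into a nested series with its own parse state and then returns to the outer input, whereas a get-word repositions the current input; at present the get-word's value must be a position in the same series.

```rebol
; INTO: parse the nested block [b c] with a sub-rule, then
; continue in the outer block at 'd.
print parse [a [b c] d] ['a into ['b 'c] 'd]

; Get-word: MARK: records the current position, :MARK moves the
; input back to it. The series itself stays the same.
print parse "abab" [mark: "ab" :mark "abab"]
```

Both calls are expected to succeed. The proposal under discussion would let the get-word refer to a different series, with no return to the original input afterward.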

Maxim Olivier-Adlhoch
9-Oct-2009 21:49:17
I'm in favor of allowing input switching.

This can speed up rules that use change, insert, or remove a lot, but at the cost of a bit more know-how.

Creating new series on the fly can be faster than letting a series grow. It's even more substantial when the interim series is transient and will be changed over and over, exponentially so with long strings.

Properly document the fact that backtracking offsets are applied to the new input instead of the old, and people will have been warned.

I'm now wondering... how can we use backtracking on multiple series to our advantage... hehehe

Brian Hawley
13-Oct-2009 5:50:07
By the way, the "type of series" change doesn't work as promised yet - not between strings and blocks or vice versa. See bug#1263 for details.
Vincent Ecuyer
16-Oct-2009 4:59:44
Input switching would make parsing big (or streaming) files easier, as we wouldn't have to keep all the data in memory and could read it as needed without losing the current parse state.


Updated 15-Apr-2024 - Copyright REBOL Technologies