
Difference between Rebol 2 and Rebol 3 when mixing SOME in parse with CHANGE


Imagine a simplified example of a block of blocks containing words:

samples: [
    [a a c a]
    [a a c b]
    [b a c a]
    [c a c b]
    [c c c c]
]

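Each block needs to end up as [c c c c]. If a value is 'a, it is changed to 'b; if a value is 'b, it is changed to 'c; if a value is 'c, we print "C" and move on. After each change, :s resets the input position to the changed value so that it is matched again: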

repeat i length? samples [
    prin ["^/Sample" i "- "]
    parse samples/:i [
        some [
            s: 'a (change s 'b) :s
            | s: 'b (change s 'c) :s
            | 'c (prin "C")
        ]
    ]
]

In Rebol 2, this works as expected:

Sample 1 - CCCC
Sample 2 - CCCC
Sample 3 - CCCC
Sample 4 - CCCC
Sample 5 - CCCC

But Rebol 3 seems to have a problem (bug?):

Sample 1 - 
Sample 2 - 
Sample 3 - 
Sample 4 - C
Sample 5 - CCCC

I don't know if it's related, but a Rebol Wikibook containing a list of changes to parse between Rebol 2 and Rebol 3 says this:

SOME subrule - to prevent unwanted infinite loops in R3 this rule stops also when the subrule matches the input but does not advance it
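
A smaller case that seems to isolate what the note describes. This is only a sketch: the one-element block and the bare 'b alternative are made up for illustration and are not part of the original example.

```rebol
blk: [a]

; 'a matches and is changed to 'b, then :s rewinds to where the match began,
; so this pass of the subrule succeeds without advancing the input.
; If the Wikibook note applies, R3's SOME should stop at that point and
; PARSE should fail because the end of the block was never reached.
; R2's SOME should go round again, match 'b and reach the end.
print parse blk [some [s: 'a (change s 'b) :s | 'b]]

; In either version the change itself has already happened.
probe blk
```

If the two versions disagree on the value printed by the first expression while agreeing on the probed block, the difference would come from when SOME stops looping rather than from CHANGE.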

(Note: This simplified example provided by @rgchris in StackOverflow chat, repeated here to better preserve "institutional knowledge" and permit updating.)


Solution

  • If it does not actually matter whether you use ANY (0..n) or SOME (1..n), as is the case in your example, you can use WHILE in R3. WHILE basically matches R2's ANY:

    >> blk: [a a c a]
    
    >> parse blk [while [s: 'a (change s 'b) :s | s: 'b (change s 'c) :s | 'c]]
    == true
    
    >> blk
    == [c c c c]
    

    Alternatively, if that doesn't suffice because you really need SOME semantics, you could rewrite SOME using more basic primitives. Instead of rule: [some subrule] you can use the recursive rule: [subrule opt rule], which matches subrule once and then optionally matches itself again:

    >> blk: [a a c a]

    >> subrule: [s: 'a (change s 'b) :s | s: 'b (change s 'c) :s | 'c]

    >> parse blk rule: [subrule opt rule]
    == true

    >> blk
    == [c c c c]
    

    However, because this version recurses once per match, it might hit PARSE limits that the original SOME does not (especially in R2).
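
For completeness, here is the WHILE variant applied to the full loop from the question. It is a sketch for Rebol 3 only (it relies on the WHILE keyword mentioned above), and the only change from the question's code is swapping SOME for WHILE.

```rebol
samples: [
    [a a c a]
    [a a c b]
    [b a c a]
    [c a c b]
    [c c c c]
]

; Same rule as in the question, with SOME replaced by WHILE (Rebol 3).
; WHILE keeps iterating even when :s rewinds the input, so each changed
; value is matched again until it has become 'c.
repeat i length? samples [
    prin ["^/Sample" i "- "]
    parse samples/:i [
        while [
            s: 'a (change s 'b) :s
            | s: 'b (change s 'c) :s
            | 'c (prin "C")
        ]
    ]
]
```

With this change, every sample should end up as [c c c c] and the printed lines should match the Rebol 2 output shown in the question.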