Matching a string with a specific end in LPeg


I'm trying to capture a string with a combination of a's and b's but always ending with b. In other words:

local patt = S'ab'^0 * P'b'

This should match aaab and bbabb but not aaa or bba. However, the pattern above does not match anything. Is this because S'ab'^0 is greedy and consumes the final b, leaving nothing for P'b'? I think so, and I can't think of any alternative except perhaps resorting to lpeg.Cmt, which seems like overkill. Does anyone know how to match such a pattern? I saw this question, but the solution there stops at the first end marker (i.e. 'cat' there, 'b' here), and in my case I need to accept the 'b's in the middle.

P.S. What I'm actually trying to do is match an expression whose outermost rule is a function call. E.g.

func();
func(x)(y);
func_arr[z]();

all match but

exp;
func()[1];
4 + 5;

do not. The rest of my grammar works, and I'm pretty sure this boils down to the same issue, but for completeness, the grammar I'm working with looks something like:

top_expr = V'primary_expr' * V'postfix_op'^0 * V'func_call_op' * P';';
postfix_op = V'func_call_op' + V'index_op';

And similarly the V'postfix_op'^0 eats up the func_call_op I'm expecting at the end.


Solution

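  • Yes, you've correctly identified the problem: LPeg repetition is greedy and does not backtrack, so once V'postfix_op'^0 has consumed the last func_call_op, it will not give it back. I think the solution is to restructure the grammar so that the repeated part can never consume the trailing call. I'd change postfix_op from V'func_call_op' + V'index_op' to V'func_call_op'^0 * V'index_op', so that every repeated group ends with an index_op, and also change the final V'func_call_op' in top_expr to V'func_call_op'^1 to allow several function calls at the end.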

    Update: as suggested in the comments, the solution to the a/b problem would be (P'b'^0 * P'a')^0 * P'b'^1, i.e. zero or more groups that each end in a, followed by one or more b's.
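
    Both fixes can be sketched in one short script. This is an illustration under assumptions rather than the asker's actual grammar: it assumes LPeg is installed and loadable as `lpeg`, the definitions of primary_expr, func_call_op and index_op are hypothetical stand-ins (the question does not show them), and the `eof` anchor is added here only so that partial matches do not count.

```lua
local lpeg = require 'lpeg'
local P, R, S, V = lpeg.P, lpeg.R, lpeg.S, lpeg.V

-- Assumption: require the whole subject to be consumed.
local eof = -P(1)

-- Original attempt: S'ab'^0 consumes every a and b, so P'b' has nothing left.
local greedy = S'ab'^0 * P'b' * eof

-- Fix: zero or more groups that each end in 'a', then one or more 'b'.
local ends_in_b = (P'b'^0 * P'a')^0 * P'b'^1 * eof

print(greedy:match('aaab'))      -- nil
print(ends_in_b:match('aaab'))   -- 5
print(ends_in_b:match('bbabb'))  -- 6
print(ends_in_b:match('aaa'))    -- nil
print(ends_in_b:match('bba'))    -- nil

-- Hypothetical token definitions, kept deliberately simple.
local ident  = (R('az', 'AZ') + P'_') * (R('az', 'AZ', '09') + P'_')^0
local number = R'09'^1

local grammar = P{
  'top_expr',
  -- The trailing func_call_op^1 is safe because every postfix_op ends in index_op.
  top_expr     = V'primary_expr' * V'postfix_op'^0 * V'func_call_op'^1 * P';' * eof,
  postfix_op   = V'func_call_op'^0 * V'index_op',
  -- Stand-ins for the rules the question does not show:
  primary_expr = ident + number,
  func_call_op = P'(' * V'primary_expr'^-1 * P')',  -- at most one simple argument
  index_op     = P'[' * V'primary_expr' * P']',
}

for _, src in ipairs{'func();', 'func(x)(y);', 'func_arr[z]();',
                     'exp;', 'func()[1];', '4 + 5;'} do
  print(src, grammar:match(src) ~= nil)  -- first three true, last three false
end
```

    The same idea drives both patterns: instead of relying on the repetition to give back its last item, the repeated group is defined so that it can only end on something other than the required terminator (a here, index_op in the grammar).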