c parsing lemon

Custom deallocation function for the token destructor in Lemon


I want Lemon to parse a simple C-like expression that supports integer and string comparison over a predefined set of variables with known names. For simplicity, let's assume it supports only string comparison. The following string is a good example of the kind of expression I mean:

a == "literal_1" || a == "literal_2"

So my lexer would have to feed the parser with values in the following order:

void *p = parserAlloc(malloc);
parser(p, TOK_VARIABLE_A, NULL);
parser(p, TOK_OPERATOR_EQ, NULL);
parser(p, TOK_LITERAL, strdup("literal_1"));
parser(p, TOK_OPERATOR_OR, NULL);
parser(p, TOK_VARIABLE_A, NULL);
parser(p, TOK_OPERATOR_EQ, NULL);
parser(p, TOK_LITERAL, strdup("literal_2"));
parser(p, 0, NULL); /* token code 0 signals end of input */
parserFree(p, free);

I have to duplicate the literal strings passed to the parser, because they may contain escape sequences which I must decode first. But who is responsible for freeing the memory after the parsing is done? Fortunately, Lemon comes to the rescue with its %token_destructor directive, which applies to every token value, so I can write:

%token_destructor { free($$); }

But in fact, I don't want to hard-code the use of malloc, strdup and free in my parser and lexer. I want to be able to pass allocator and deallocator functions as parameters, and use them not only in parserAlloc and parserFree, but also for token allocation and deallocation.

How can I declare an additional parameter for parserAlloc so that I can pass both malloc and free at the same time? There is the %extra_argument directive in Lemon, but it makes me pass my parameter every time I feed a token.


Solution

  • The malloc argument to parserAlloc is not stored anywhere, because the Lemon-generated parser never allocates memory. [Note 1] And, of course, the free function is not stored anywhere either, because it isn't provided until you call parserFree.

    Normally, you won't need an alloc function inside a parser action either, but if you use %destructor or %token_destructor, then you will want a free function. The only documented mechanism for that is the %extra_argument feature, which, as you say, requires supplying the argument on every call to the parser. That's a bit annoying, particularly since the parser immediately stores it in the parser state structure (i.e. the object passed as the first argument to the parser function), but that's the way it is. It would be easy to change, and Lemon is unencumbered, so you can make whatever changes you want. But as provided, %extra_argument is the only way.

    If you needed both alloc and free functions in your actions, for whatever reason, you could make the %extra_argument a pointer to a struct (which is actually the normal case for %extra_argument); the struct would contain pointers to both functions. Alternatively, you could use a single function with the standard realloc interface: realloc(NULL, sz) is equivalent to malloc(sz), and realloc(p, 0) traditionally acts like free(p) when p is not NULL. The C standard does not guarantee the zero-size behavior, though, so a custom function with this interface should handle that case explicitly. See man realloc for details. That won't bother the Lemon parser, because it never uses either malloc or free.

    Notes

    1. That's not quite true. There is an as-far-as-I-know undocumented feature: if you set %stack_size 0, then the generated parser will grow the parser stack before it overflows instead of reporting a stack overflow. In that case, the parser uses the standard library realloc to allocate or reallocate the stack, not the malloc function provided to parserAlloc, and parserFree frees the stack with the standard library free, not the function passed as an argument.
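
The two options described in the answer (a struct holding both functions, or a single realloc-style function) could look roughly like the sketch below. Everything named here (struct tok_allocator, tok_dup, tok_release, tok_realloc) is hypothetical and not part of Lemon. The grammar directives in the comment show how the struct would presumably be wired in as the %extra_argument, and they assume the extra argument is visible inside the token destructor.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical bundle of allocation callbacks; the names are illustrative
 * and not part of Lemon. A pointer to this struct would be the
 * %extra_argument, e.g. in the grammar file (assumed wiring):
 *
 *   %token_type       { char * }
 *   %extra_argument   { struct tok_allocator *ta }
 *   %token_destructor { tok_release(ta, $$); }
 */
struct tok_allocator {
    void *(*alloc)(size_t size);
    void  (*release)(void *ptr);
};

/* Lexer side: copy a literal using the caller-supplied allocator.
 * Returns NULL if the allocator fails. */
char *tok_dup(const struct tok_allocator *ta, const char *text)
{
    assert(ta != NULL && text != NULL);
    size_t len = strlen(text) + 1;
    char *copy = ta->alloc(len);
    if (copy != NULL)
        memcpy(copy, text, len);
    return copy;
}

/* Parser side: what the token destructor would call. NULL is ignored, so
 * tokens that carry no payload need no special case. */
void tok_release(const struct tok_allocator *ta, void *ptr)
{
    assert(ta != NULL);
    if (ptr != NULL)
        ta->release(ptr);
}

/* Alternative: one realloc-style callback covers both roles. The zero-size
 * case is handled explicitly instead of being forwarded to realloc(). */
void *tok_realloc(void *ptr, size_t size)
{
    if (size == 0) {
        free(ptr);
        return NULL;
    }
    return realloc(ptr, size);
}
```

With such a struct in place, each call that feeds a token would carry the extra argument as its last parameter, along the lines of parser(p, TOK_LITERAL, tok_dup(&ta, "literal_1"), &ta), where ta is a struct tok_allocator initialized with the chosen functions. Routing NULL through tok_release keeps a single destructor usable for tokens such as TOK_OPERATOR_EQ that have no string attached.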