Tags: erlang, elixir, template-engine, lexical-analysis, leex

Is leex a good choice for writing a template engine lexer?


I am in the initial design phase of implementing a jinja2-like template language for Elixir. I had been inclined to write the lexer by hand, but I have recently come across the leex module for Erlang. It looks promising, but after some initial research I am unsure if it is the proper tool for my purposes.

One of my hesitations is that a template language is, essentially, a string-embedded language, and it is not clear how to tokenize such a thing with leex. As a trivial example, imagine tokenizing this template:

<p>Here is some text for inclusion in the template.</p>
{% for x in some_variable %}
  The value for the variable: {{ x }}.
{% endfor %}

In this example, I need to ensure that the keywords 'for' and 'in' are tokenized differently depending on:

  • If they are inside a statement tag: {% %}
  • If they are inside a variable tag: {{ }}
  • If they are in the template, but not inside any tags.

To me it looks like I would either need to make two passes in the tokenizing phase, or roll my own lexer to do this in one pass.

Can anyone with experience in lexical analysis (particularly with leex) or in writing template engines provide some insight into the best way forward?


Solution

  • Let me apologize in advance if this isn't helpful, but I think of lexical analysis as having the power of regular expressions and, as such, I suspect that what you are trying to do is not in the sweet spot of REs or Leex. The first pass would be to go from source code to lexical elements (tokens), which would mostly be devoid of context and would be an appropriate use of Leex.
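
    As a rough, untested illustration of that first pass, a Leex scanner could emit only context-free tokens and leave all interpretation to later stages. Every name below (the file name, the token names) is made up for the sake of the sketch; it is not meant as a drop-in scanner:

    %% template_lexer.xrl -- illustrative sketch of a context-free first pass.

    Definitions.

    WORD = [a-zA-Z_][a-zA-Z0-9_]*

    Rules.

    %% Tag delimiters.
    \{\%   : {token, {open_stmt,  TokenLine}}.
    \%\}   : {token, {close_stmt, TokenLine}}.
    \{\{   : {token, {open_expr,  TokenLine}}.
    \}\}   : {token, {close_expr, TokenLine}}.

    %% Keywords are emitted everywhere, even in plain template text;
    %% deciding whether they actually mean anything is left to the parser.
    for    : {token, {for,    TokenLine}}.
    in     : {token, {in,     TokenLine}}.
    endfor : {token, {endfor, TokenLine}}.

    {WORD} : {token, {word, TokenLine, TokenChars}}.

    %% Everything else, one character at a time (crude but short;
    %% a real scanner would match longer runs of text).
    .|\n   : {token, {char, TokenLine, TokenChars}}.

    Erlang code.
    %% Nothing needed here for this sketch.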

    I think the different, context-sensitive semantics of your FOR and IN tokens would be handled during parsing, with Erlang's Yecc. You may be able to handle comments in the lexical analysis phase, but in general I think you would use a combination of Leex and Yecc.
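
    To show how the grammar, rather than the scanner, could then give FOR and IN their meaning, here is an equally rough Yecc sketch that pairs with the scanner above (again, all names are illustrative and untested). It produces a flat list of nodes; pairing each for_start with its for_end into a nested tree would be a small follow-up pass:

    %% template_parser.yrl -- illustrative sketch that pairs with the scanner above.

    Nonterminals template elements element text_item.
    Terminals open_stmt close_stmt open_expr close_expr for in endfor word char.
    Rootsymbol template.

    template -> elements : '$1'.

    elements -> element elements : ['$1' | '$2'].
    elements -> '$empty'         : [].

    %% Inside {% %} and {{ }} the keyword tokens are structural...
    element -> open_stmt for word in word close_stmt : {for_start, chars('$3'), chars('$5')}.
    element -> open_stmt endfor close_stmt           : for_end.
    element -> open_expr word close_expr             : {output, chars('$2')}.

    %% ...while outside any tag the very same tokens are plain text.
    element -> text_item : {text, '$1'}.

    text_item -> word   : chars('$1').
    text_item -> char   : chars('$1').
    text_item -> for    : "for".
    text_item -> in     : "in".
    text_item -> endfor : "endfor".

    Erlang code.

    chars({_Category, _Line, Chars}) -> Chars.

    The point of the sketch is only that the same 'for' and 'in' tokens reduce as keywords inside {% %} and as ordinary text everywhere else, purely because of where they sit in the grammar, with no lexer state involved.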