I'm developing a CSS parser, but I cannot figure out how to parse nested function calls like alpha(rgb(1, 2, 3), 0.5).
Here is my code:
# -*- coding: utf-8 -*-
from pyparsing import (Combine, Group, Optional, Suppress, Word, ZeroOrMore,
                       cStyleComment, delimitedList, hexnums, printables,
                       quotedString)
# split everything on spaces, tabs, and newlines
words = ZeroOrMore(cStyleComment | Word(printables))
digit = '0123456789'
underscore = '_'
hyphen = '-'
hexDigit = 'abcdefABCDEF' + digit
letter = 'abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ'
az_AZ_underscore = underscore + letter
az_AZ_09_underscore = az_AZ_underscore + digit
az_AZ_hyphen = hyphen + letter
az_AZ_09_hiphen = az_AZ_hyphen + digit
LPAR = Suppress('(')
RPAR = Suppress(')')
# identifiers
identifier = Combine(Word(az_AZ_underscore) + Optional(Word(az_AZ_09_underscore)))
# hyphenated identifiers
identifier_reserved = Combine(Word(az_AZ_hyphen) + Optional(Word(az_AZ_09_hiphen)))
# numbers
hexadecimal = Word(hexnums, min=1)
integer = Word(digit, min=1)
decimal = Combine('.' + integer | integer + Optional('.' + integer))
# values
color_hex = Combine('#' + hexadecimal )
at_identifier = Combine('@' + identifier)
arg = at_identifier | color_hex | decimal | quotedString
function_call = identifier + LPAR + Optional(delimitedList(arg)) + RPAR
value = Group(color_hex | at_identifier | function_call)
print(value.parseString('a(b())'))
I would like to do something like arg = at_identifier | color_hex | decimal | quotedString | function_call, but that's not possible because the variable function_call is not defined yet at that point.
How could I parse nested function calls using Pyparsing?
You are really very close. To define a recursive grammar like this, you need to forward-declare the nested expression.
As you have already seen, this:
arg = at_identifier | color_hex | decimal | quotedString
function_call = identifier + LPAR + Optional(delimitedList(arg)) + RPAR
only parses function calls with args that are not themselves function calls.
To define this recursively, first define an empty placeholder expression, using Forward():
function_call = Forward()
We don't know yet what will go into this expression, but we do know that it will be a valid argument type. Now that it has been declared, we can use it:
arg = at_identifier | color_hex | decimal | quotedString | function_call
Now that arg is defined, we can define what goes into a function_call. Ordinary Python assignment with '=' would rebind the name function_call to a new expression and leave the placeholder already used in arg empty, so we need an operator that modifies the existing Forward object instead of replacing it. Pyparsing provides the <<= and << operators for this (<<= is preferred):
function_call <<= identifier + LPAR + Optional(delimitedList(arg)) + RPAR
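Putting the three steps together, a minimal self-contained sketch might look like the following. The identifier and number definitions are simplified stand-ins for the ones in the question (an assumption made to keep the example short), and the legacy camelCase pyparsing names are used to match the code above:

```python
from pyparsing import (
    Combine, Forward, Optional, Suppress, Word,
    alphas, alphanums, delimitedList, hexnums, nums, quotedString,
)

LPAR, RPAR = Suppress("("), Suppress(")")

# Simplified terminals (assumption: not identical to the question's definitions)
identifier = Word(alphas + "_", alphanums + "_")
integer = Word(nums)
decimal = Combine("." + integer | integer + Optional("." + integer))
color_hex = Combine("#" + Word(hexnums))
at_identifier = Combine("@" + identifier)

# 1. Forward-declare the recursive expression
function_call = Forward()
# 2. Use the placeholder wherever an argument may itself be a call
arg = at_identifier | color_hex | decimal | quotedString | function_call
# 3. Fill in the placeholder in place with <<= (not with plain =)
function_call <<= identifier + LPAR + Optional(delimitedList(arg)) + RPAR

flat = function_call.parseString("alpha(rgb(1, 2, 3), 0.5)", parseAll=True).asList()
print(flat)  # ['alpha', 'rgb', '1', '2', '3', '0.5']
```

Note that the result is a single flat list: the nested call parses, but nothing in the output shows which arguments belong to which function.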
The grammar is now sufficient to parse your sample string, but you've lost visibility into the actual nesting structure. If you group your function call like this:
function_call <<= Group(identifier + LPAR + Group(Optional(delimitedList(arg))) + RPAR)
you will always get a predictable structure from a function call (a name and the parsed args), and the function call itself will be grouped.
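As a rough illustration of that structure, here is the grouped version applied to the sample from the question. The terminal definitions are again simplified stand-ins, and render is a hypothetical helper (not part of pyparsing) added only to show that the [name, args] shape is easy to walk recursively:

```python
from pyparsing import (
    Combine, Forward, Group, Optional, Suppress, Word,
    alphas, alphanums, delimitedList, hexnums, nums, quotedString,
)

LPAR, RPAR = Suppress("("), Suppress(")")

# Simplified terminals (assumption: not identical to the question's definitions)
identifier = Word(alphas + "_", alphanums + "_")
integer = Word(nums)
decimal = Combine("." + integer | integer + Optional("." + integer))
color_hex = Combine("#" + Word(hexnums))
at_identifier = Combine("@" + identifier)

function_call = Forward()
arg = at_identifier | color_hex | decimal | quotedString | function_call
# Outer Group wraps the whole call; inner Group wraps the argument list
function_call <<= Group(identifier + LPAR + Group(Optional(delimitedList(arg))) + RPAR)

nested = function_call.parseString("alpha(rgb(1, 2, 3), 0.5)", parseAll=True).asList()
print(nested)  # [['alpha', [['rgb', ['1', '2', '3']], '0.5']]]


def render(call):
    """Hypothetical helper: rebuild the source text from a [name, args] pair."""
    name, args = call
    parts = [render(a) if isinstance(a, list) else a for a in args]
    return name + "(" + ", ".join(parts) + ")"


print(render(nested[0]))  # alpha(rgb(1, 2, 3), 0.5)
```

Every call is a two-element list, so a call with no arguments, such as the inner b() in a(b()), still yields a name followed by an empty argument list.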