Search code examples
regexbraces

regular expression for content within braces


is there a regular expression to match the content within braces. For example with the following:

d = {'key': {'a': [1,2,3]}}

I would want to match {'key': {'a': [1,2,3]}} and {'a': [1,2,3]}, but not {'key': {'a': [1,2,3]}


Solution

  • In classical regular expressions, this is impossible - DFAs can't parse nested pairs.

    There are ways to do it with extended regular expressions, such as recursive expressions that are allowed in some regex engines (such as Perl regex), but they're not always pretty. (too much php provided the Perl version: /\{(?:[^{}]+|(?R))*\}/ with the (?R) option being the recursive match.)

    You don't necessarily need regex to do this kind of thing though. You could do it simply by walking through the list and keeping a stack of open braces (and what position they were seen at). Then whenever you see an open brace, you push its position onto the stack, and whenever you see a close brace, you pop the most recently seen open brace off the stack, and use its position plus the current position as the bounds for a substring which becomes one of your matches. Repeat until you reach the end of the string.