Regex to match text enclosed in { } where the text may contain {{var}}

I'm trying to create a regular expression to match a pattern of the form:

content = "identifier{ {{var1}} rest of the content} outer content identifier{ {{var2}} another content} identifier{ content with no vars }"

so that if I run re.findall on content, the return value should be:

["{{var1}} rest of the content", "{{var2}} another content", "content with no vars"]

The pattern I want to match is identifier\{.*?\}, but I don't know how to make it work, because the closing delimiter } also appears inside the text I want to capture. With a non-greedy quantifier the match stops too early, at the first } of {{var}}; with a greedy quantifier the separate blocks are merged into a single match, which I don't want.
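
A minimal sketch reproducing both failure modes on the sample string (the names lazy and greedy are only illustrative, and a capture group is added so that findall returns the inner text):

```python
import re

content = "identifier{ {{var1}} rest of the content} outer content identifier{ {{var2}} another content} identifier{ content with no vars }"

# Non-greedy: each match ends at the first "}", which is inside "{{var}}"
lazy = re.findall(r"identifier\{(.*?)\}", content)
print(lazy)

# Greedy: runs from the first "identifier{" to the very last "}"
greedy = re.findall(r"identifier\{(.*)\}", content)
print(len(greedy))
```

The non-greedy version returns truncated pieces such as ' {{var1', while the greedy version returns a single merged match.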

Solution

If you want to allow multiple occurrences of {{var}} and not allow any other curly braces:

{([^{}]*(?:{{[^{}]*}}[^{}]*)*)}

The pattern matches:

{ Match literally
( Capture group 1
- [^{}]* Match optional chars other than { and }
- (?: Non capture group
  - {{ Match literally
  - [^{}]* Match optional chars other than { and }
  - }} Match literally
  - [^{}]* Match optional chars other than { and }
- )* Close the non capture group and optionally repeat it
) Close group 1
} Match literally
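
The same breakdown can be kept next to the pattern itself by compiling it with re.VERBOSE. This is only a sketch: the name PATTERN and the sample string are made up for illustration, and the literal braces are escaped here purely for readability.

```python
import re

# Verbose-mode spelling of the pattern above; whitespace and
# "#" comments inside the pattern are ignored by the regex engine.
PATTERN = re.compile(
    r"""
    \{                  # opening brace
    (                   # capture group 1
        [^{}]*          # optional chars other than { and }
        (?:             # non-capture group
            \{\{        # literal {{
            [^{}]*      # optional chars other than { and }
            \}\}        # literal }}
            [^{}]*      # optional chars other than { and }
        )*              # repeat the group zero or more times
    )                   # close group 1
    \}                  # closing brace
    """,
    re.VERBOSE,
)

# Hypothetical sample with two {{var}} placeholders in one block
sample = "identifier{ {{a}} x {{b}} y} identifier{ plain }"
print(PATTERN.findall(sample))
```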


To remove the leading and trailing whitespace you can use strip:

import re

pattern = r"{([^{}]*(?:{{[^{}]*}}[^{}]*)*)}"
content = "identifier{ {{var1}} rest of the content} outer content identifier{ {{var2}} another content} identifier{ content with no vars }"

print([s.strip() for s in re.findall(pattern, content)])

Output

['{{var1}} rest of the content', '{{var2}} another content', 'content with no vars']

If you want the capture group values without the surrounding whitespace, and want to match only when the opening and closing curly braces are single (not part of {{ or }}), you can use non-greedy quantifiers with negative lookarounds:

(?<!{){\s*([^{}]*?(?:{{[^{}]+}}[^{}]*?)*)\s*}(?!})
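
A short sketch of this second pattern on the sample string from the question. No strip call is needed because the \s* parts sit outside the capture group. The bare "{{var}}" input is an extra illustration, not part of the question, showing what the lookarounds change.

```python
import re

strict = r"(?<!{){\s*([^{}]*?(?:{{[^{}]+}}[^{}]*?)*)\s*}(?!})"
loose = r"{([^{}]*(?:{{[^{}]*}}[^{}]*)*)}"
content = "identifier{ {{var1}} rest of the content} outer content identifier{ {{var2}} another content} identifier{ content with no vars }"

# Surrounding whitespace is consumed by \s* outside group 1
trimmed = re.findall(strict, content)
print(trimmed)

# A bare placeholder: the lookarounds reject it, while the first
# pattern matches its inner pair of braces
print(re.findall(strict, "{{var}}"))
print(re.findall(loose, "{{var}}"))
```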
