python regex flask routes werkzeug

Rule for capturing types and/or parameterized paths from werkzeug/Flask routing rules


I have a set of Flask routes:

/<string:name>/<path:id>/
/<name>/<path:id>/ 
/<string:name>/<id>/

I want to use a regex to extract name and id, so that each route becomes:

/{name}/{id}/
/{name}/{id}/ 
/{name}/{id}/

A single regex should handle all of them (one regex to rule them all): placeholders with a converter type, like <string:name> or <path:id>, and placeholders without one, like <name>.

So far I have tried:

(<(.*?\:)?(.*?)>)

which only produces:

/{name}/{id}/
/{id}/  # <--- why is this not /{name}/{id}/?
/{name}/{id}/

Can any regex expert help?

Online REGEX: https://regex101.com/r/iL3jK2/3
The issue: https://github.com/rochacbruno/flasgger/issues/10


Solution

  • I suggest using

    (<([^<>]*:)?([^<>]*)>)
    

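    You probably do not need the outer (...). It is only useful with re.findall, where it puts the whole placeholder into each result tuple. Otherwise you can remove it, use re.finditer, and get the whole match with match.group(0); the two remaining groups are then numbered 1 and 2 instead of 2 and 3.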
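To illustrate the two options, here is a minimal sketch; the sample rule string and the variable names are only assumptions for the example. It runs the pattern once with the outer group through re.findall and once without it through re.finditer.

```python
import re

rule = "/<name>/<path:id>/"  # sample route taken from the question

# With the outer group, re.findall returns one tuple per match:
# (whole placeholder, converter prefix, parameter name).
# A group that did not take part in the match shows up as "".
with_outer = re.findall(r"(<([^<>]*:)?([^<>]*)>)", rule)

# Without the outer group, re.finditer yields match objects.
# group(0) is the whole placeholder; a group that did not take
# part in the match is None.
without_outer = [
    (m.group(0), m.group(1), m.group(2))
    for m in re.finditer(r"<([^<>]*:)?([^<>]*)>", rule)
]

print(with_outer)
print(without_outer)
```

Both variants find the same two placeholders; they differ only in how a missing converter prefix is reported.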

    Explanation:

    • < - a literal <
    • ([^<>]*:)? - optional Group 2 (the converter prefix, e.g. string:) matching
      • [^<>]* - zero or more characters other than < and >
      • : - a literal :
    • ([^<>]*) - Group 3 (the parameter name) matching zero or more characters other than < and >
    • > - a literal closing >

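    Your pattern overmatches. Although .*? is lazy, . can match any character, including < and >, so (.*?\:)? expands as far as necessary to reach the first :. In /<name>/<path:id>/ it starts at the first < and consumes name>/<path: on the way to that colon, so the whole of <name>/<path:id> becomes a single match and only id is captured; name is swallowed. The negated character class [^<>] cannot cross a < or >, so each match stays inside one <...> pair and name is matched on its own.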
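As a quick check of the above, here is a short sketch that rewrites the three routes from the question with re.sub. The helper name to_braces is hypothetical, and the converter prefix is made non-capturing here (a small variation on the pattern above) so that the parameter name is group 1. The question's pattern is included only for comparison.

```python
import re

# Suggested pattern, without the outer group; the converter
# prefix is non-capturing so the parameter name is group 1.
PARAM = re.compile(r"<(?:[^<>]*:)?([^<>]*)>")

# The question's pattern in the same shape, for comparison.
ORIGINAL = re.compile(r"<(?:.*?:)?(.*?)>")

routes = [
    "/<string:name>/<path:id>/",
    "/<name>/<path:id>/",
    "/<string:name>/<id>/",
]


def to_braces(rule, pattern=PARAM):
    """Replace each <converter:name> or <name> placeholder with {name}."""
    return pattern.sub(r"{\1}", rule)


for rule in routes:
    print(rule, "->", to_braces(rule), "| original:", to_braces(rule, ORIGINAL))
```

With the negated character class all three routes come out as /{name}/{id}/, while the original pattern turns the second one into /{id}/, as described in the question.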