Search code examples
c#pythonregexlookbehind

Fixed Length Regex Required?


I have this regex that uses forward and backward look-aheads:

import re
re.compile("<!inc\((?=.*?\)!>)|(?<=<!inc\(.*?)\)!>")

I'm trying to port it from C# to Python but keep getting the error

look-behind requires fixed-width pattern

Is it possible to rewrite this in Python without losing meaning?

The idea is for it to match something like

<!inc(C:\My Documents\file.jpg)!>

Update

I'm using the lookarounds to parse HTTP multipart text that I've modified

body = r"""------abc
Content-Disposition: form-data; name="upfile"; filename="file.txt"
Content-Type: text/plain

<!inc(C:\Temp\file.txt)!>
------abc
Content-Disposition: form-data; name="upfile2"; filename="pic.png"
Content-Type: image/png

<!inc(C:\Temp\pic.png)!>
------abc
Content-Disposition: form-data; name="note"

this is a note
------abc--
"""

multiparts = re.compile(...).split(body)

I want to just get the file path and other text when I do the split and not have to remove the opening and closing tags

Code brevity is important, but I'm open to changing the <!inc( format if it makes the regex doable.


Solution

  • For paths + "everything" in the same array, just split on the opening and closing tag:

    import re
    p = re.compile(r'''<!inc\(|\)!>''')
    awesome = p.split(body)
    

    You say you're flexible on the closing tags, if )!> can occur elsewhere in the code, you may want to consider changing that closing tag to something like )!/inc> (or anything, as long as it's unique).

    See it run.