I have a regex that parses filenames, extracting an optional ID and a product name. It only works partially, and I would like to understand why and how to fix it.
The files look like this (simplified):
So there is garbage (abc...def) in front of the @ sign, with an optional ID inside it. In reality the ID is more complicated (not just numbers), but it has a fixed format with a fixed length. The whole part up to and including the @ is optional as well.
This is the regex that nearly works:
^(.*?(?<id>\d{2}).*?@)?(?<product>.*)\.\w+$
It works for cases 1 and 3. As soon as I add another ? after the ID group so that case 2 also matches, case 1 stops working.
Regex I thought would work:
^(.*?(?<id>\d{2})?.*?@)?(?<product>.*)\.\w+$
This extracts the ID, but it must be present
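The behaviour of both patterns can be reproduced with a short Python sketch. The three filenames are hypothetical stand-ins for the cases described above (the real names are not shown here), and the named groups are written as (?P<name>...) because that is how Python's re module spells them; otherwise the patterns are unchanged.

```python
import re

# Hypothetical filenames standing in for the three cases (assumed, not the real data).
CASES = {
    1: "abc12def@product.txt",  # garbage with an ID in front of the @
    2: "abcdef@product.txt",    # garbage without an ID in front of the @
    3: "product.txt",           # no @ part at all
}

# Same patterns as in the question, with Python's (?P<name>...) group syntax.
REQUIRED_ID = re.compile(r"^(.*?(?P<id>\d{2}).*?@)?(?P<product>.*)\.\w+$")
OPTIONAL_ID = re.compile(r"^(.*?(?P<id>\d{2})?.*?@)?(?P<product>.*)\.\w+$")


def parse(pattern, filename):
    """Return (id, product) for a filename, or None if the pattern does not match."""
    m = pattern.match(filename)
    return (m.group("id"), m.group("product")) if m else None


for number, name in CASES.items():
    print(number, parse(REQUIRED_ID, name), parse(OPTIONAL_ID, name))
```

With the ID required, case 2 falls through to the product group, which then swallows the garbage and the @. With the ID optional, every case matches, but the ID group stays empty even in case 1.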
Can anyone explain to me why the second regex does not extract the ID and what I can do to fix it?
Thanks!!
/^(.*?(?<id>\d{2}?).*?@)?(?<product>.*)\.\w+$/gm
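You need to put the ? inside the capturing group, where it makes the quantifier non-greedy instead of making the whole group optional. The reason:
.*?(?<id>\d{2})? - with the ? outside the capturing group, the group itself is optional. The lazy .*? in front of it matches nothing, the group is skipped at the first position, and the following .*?@ then consumes everything up to the @, i.e. abc, your ID and def, so the ID is never captured.
.*?(?<id>\d{2}?) - with the ? inside, the group is required, so the lazy .*? has to expand until your 2-digit ID can be matched. Note that \d{2}? matches the same text as \d{2}, so the ID is required again whenever the part with the @ is matched.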
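To check this, here is a small Python sketch that runs the pattern from this answer against three hypothetical filenames (assumed stand-ins for the three cases in the question). It also tries one possible alternative that is not part of the answer above and is only a sketch: the ID goes into its own optional group together with the lazy prefix leading up to it, and [^@] keeps that scan in front of the @. Named groups use Python's (?P<name>...) spelling, and \d{2} stands in for the real fixed-length ID format.

```python
import re

# Pattern from this answer, with Python's (?P<name>...) group syntax.
LAZY_INSIDE = re.compile(r"^(.*?(?P<id>\d{2}?).*?@)?(?P<product>.*)\.\w+$")

# Assumed alternative (not from the answer): the lazy prefix and the ID form one
# optional unit, so skipping the ID is only tried after a search for it failed.
NESTED_OPTIONAL = re.compile(
    r"^(?:(?:[^@]*?(?P<id>\d{2}))?[^@]*@)?(?P<product>.*)\.\w+$"
)

# Hypothetical filenames for cases 1, 2 and 3.
FILENAMES = ["abc12def@product.txt", "abcdef@product.txt", "product.txt"]


def parse(pattern, filename):
    """Return (id, product) for a filename, or None if the pattern does not match."""
    m = pattern.match(filename)
    return (m.group("id"), m.group("product")) if m else None


for name in FILENAMES:
    print(name, parse(LAZY_INSIDE, name), parse(NESTED_OPTIONAL, name))
```

On these inputs the lazy {2}? behaves like the original \d{2}: case 2 still ends up with the garbage in the product group. The nested variant yields the product name in all three cases and the ID only where one is present.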