Regular Expression: optional group is not working


I have this regex:

\n1\s(\d{2,8})\s(\d{0,3}(.\d{3})*,\d*)\s(\w{1,10})\s(\d{0,3}(.\d{3})*,\d*)\s(\d{0,3}(.\d{3})*,\d*)\s(\w{3}).+?Ihre Art.-Nr.\s(\d+).+?(?:DeliveryDate:\s(\d{2}.\d{2}.\d{4})).+?(?:ExtraCharge.+?entspricht:\s(\d{0,3}(.\d{3})*,\d*)\s(\w{1,10}))

Works fine so far. It matches something like this:

1 123456 25,00 Stck 100,00 2.500,00 EUR

. . . some text

Ihre Art.-Nr. 1690431

DeliveryDate: 21.11.2019

. . . some text

incl.ExtraCharge

entspricht: 222,00 EUR

Now I want the DeliveryDate and ExtraCharge parts to be optional (in some cases the values are missing in the document).

My idea was to just add a question mark to the groups:

\n1\s(\d{2,8})\s(\d{0,3}(.\d{3})*,\d*)\s(\w{1,10})\s(\d{0,3}(.\d{3})*,\d*)\s(\d{0,3}(.\d{3})*,\d*)\s(\w{3}).+?Ihre Art.-Nr.\s(\d+).+?(?:DeliveryDate:\s(\d{2}.\d{2}.\d{4}))?.+?(?:ExtraCharge.+?entspricht:\s(\d{0,3}(.\d{3})*,\d*)\s(\w{1,10}))?

but it doesn't work, and I don't know why.


Solution

  • Even if the optional group (?:DeliveryDate:\s(\d{2}.\d{2}.\d{4}))? matches nothing, the regex still requires the .+? before it and the .+? after it to each match at least one character.

    Try moving the trailing .+? inside the non-capturing group that you have for DeliveryDate, e.g.:

    (?:DeliveryDate:\s(\d{2}.\d{2}.\d{4}).+?)?
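To see the effect of where the lazy gap sits, here is a minimal Python sketch. It is an illustration under assumptions, not the asker's full pattern: it is cut down to the article number, date and surcharge, it assumes dot-matches-newline mode (`re.DOTALL`), and the names `NUM`, `DATE`, `naive` and `fixed` are made up for this example. Note that it uses a variation on the suggestion above: the gap that leads up to each optional part is moved inside that part's group. With the gap outside, the lazy `.+?` is satisfied after one character, the optional group fails at that position and is simply skipped.

```python
import re

# Hypothetical helper fragments for this sketch only.
NUM = r"\d{1,3}(?:\.\d{3})*,\d+"   # e.g. 222,00 or 2.500,00
DATE = r"\d{2}\.\d{2}\.\d{4}"      # e.g. 21.11.2019

# Lazy gaps OUTSIDE the optional groups, as in the question.
naive = re.compile(
    r"Ihre Art\.-Nr\.\s(\d+)"
    r".+?(?:DeliveryDate:\s(" + DATE + r"))?"
    r".+?(?:ExtraCharge.+?entspricht:\s(" + NUM + r")\s(\w{1,10}))?",
    re.DOTALL,  # assumption: '.' must also match newlines
)

# Leading lazy gap INSIDE each optional group: the engine first tries the
# whole group (gap plus keyword) and only skips it if the keyword is absent.
fixed = re.compile(
    r"Ihre Art\.-Nr\.\s(\d+)"
    r"(?:.+?DeliveryDate:\s(" + DATE + r"))?"
    r"(?:.+?ExtraCharge.+?entspricht:\s(" + NUM + r")\s(\w{1,10}))?",
    re.DOTALL,
)

full = (
    "Ihre Art.-Nr. 1690431\nsome text\nDeliveryDate: 21.11.2019\n"
    "some text\nincl.ExtraCharge\nentspricht: 222,00 EUR\n"
)
no_date = (
    "Ihre Art.-Nr. 1690431\nsome text\n"
    "incl.ExtraCharge\nentspricht: 222,00 EUR\n"
)
bare = "Ihre Art.-Nr. 1690431\nsome text\n"

print(naive.search(full).groups())     # ('1690431', None, None, None)
print(fixed.search(full).groups())     # ('1690431', '21.11.2019', '222,00', 'EUR')
print(fixed.search(no_date).groups())  # ('1690431', None, '222,00', 'EUR')
print(fixed.search(bare).groups())     # ('1690431', None, None, None)
```

One caveat with this sketch: if a document contains several line items, a gap like `.+?DeliveryDate` can run past the current item into the next one, so the gap may need to be restricted further for real input.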