I'm trying to write a single regex string that extracts data from report files. The tricky part is that the report file content comes in several formats, and this one regex has to match all of them. I want the regex to always match, even if some optional groups are not found.
Take the following report file contents (note: File #2 is missing the "val2" part):
- File #1: "-val1-test-val2-result-val3-done-"
I tried the following regex strings:
Regex #1(Normal): "-val1-(?<val1>.+?)-val2-(?<val2>.+?)-val3-(?<val3>.+?)-"
Problem: File #1 works fine but on file #2, the regex is not matching so I don't have any group values.
Regex #2(Non greedy): "-val1-(?<val1>.+?)(-val2-(?<val2>.+?))?-val3-(?<val3>.+?)-"
Regex #3(Boolean OR): "-val1-(?<val1>.+?)(-val2-(?<val2>.+?)|(.*?))-val3-(?<val3>.+?)-"
Regex #4(Conditional): "-val1-(?<val1>.+?)(?(-val2-(?<val2>.+?))|(.+?))-val3-(?<val3>.+?)-"
Regex #5(Conditional): "-val1-(?<val1>.+?)(?(-val2-(?<val2>.+?))(-val2-(?<val2>.+?)))-val3-(?<val3>.+?)-"
Regex #6(Conditional): "-val1-(?<val1>.+?)(?(-val2-(?<val2>.+?))(-val2-(?<val2>.+?))|(.+?))-val3-(?<val3>.+?)-"
Problem: File #2 works as expected but the val2 group of file #1 is always empty.
Conclusion: The behavior seems to be that even if an optional group is present, the regex will prioritize an empty group value over the present value. Is there a way to force getting the optional groups' value when they are present and only return (empty) when they're not?
Note: I'm using the latest .NET Framework, and the code will be ported to Java (Android). I'm trying to avoid multiple operations because of performance and bandwidth concerns.
Could anyone help me with this?
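It is possible if we make some assumptions, mainly that the values themselves never contain a hyphen. Note that this pattern treats the "val3" part as optional too: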
-val1-([^-]+)(?:-val2-([^-]+)|)(?:-val3-([^-]+)|)-
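Since the code is going to be ported to Java, here is a minimal sketch of how the pattern above could be used there. The class name `ReportParser`, the method `extract`, and the second sample input (File #1 with the "val2" part removed) are assumptions made for illustration only, not part of the original post. In Java, `Matcher.group(n)` returns `null` for a group that did not take part in the match, so the sketch maps that to an empty string.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, for illustration only.
final class ReportParser {
    // Pattern from the answer above; it assumes values never contain '-'.
    private static final Pattern REPORT = Pattern.compile(
        "-val1-([^-]+)(?:-val2-([^-]+)|)(?:-val3-([^-]+)|)-");

    // Returns {val1, val2, val3}; an absent optional group becomes "".
    // Returns null when the content does not match at all.
    static String[] extract(String content) {
        Matcher m = REPORT.matcher(content);
        if (!m.find()) {
            return null;
        }
        String[] values = new String[3];
        for (int i = 0; i < 3; i++) {
            String g = m.group(i + 1); // null if the group did not participate
            values[i] = (g == null) ? "" : g;
        }
        return values;
    }

    public static void main(String[] args) {
        // File #1 from the question.
        System.out.println(String.join(",", extract("-val1-test-val2-result-val3-done-")));
        // Assumed content without the "val2" part.
        System.out.println(String.join(",", extract("-val1-test-val3-done-")));
    }
}
```

With these two inputs, the sketch prints `test,result,done` and `test,,done`. Because `[^-]+` cannot run past the next hyphen, the first group cannot swallow the optional "-val2-" part the way a lazy `.+?` followed by a catch-all alternative can.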