I'm trying to write a regex that works for both of the examples below:
Example 1. 202101310000-daily
Example 2. my_merchant_df_20210129
The conditions are :
My regex [0-9]+ works for 202101310000-daily, but I'm not able to create a regex that would satisfy both examples.
Basically, for my use case, there are groups of files that should have a date in the format yyyymd, yyyymmdd, or yyyymmddHHMMSS. I need to filter out the files that have that kind of format anywhere in the file name using a regex.
You can repeat 3 or more sets of 2 digits, and assert no digits to the left and right of the match using negative lookarounds.
(?<!\d)(?:\d\d){3,}(?!\d)
(?<!\d) Assert no digit directly to the left
(?:\d\d){3,} Repeat matching 2 digits 3 or more times to match 6 digits, 8 digits, etc.
(?!\d) Assert no digit directly to the right
In Java:
String regex = "(?<!\\d)(?:\\d\\d){3,}(?!\\d)";
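As a minimal sketch of how that pattern could be applied to the two file names from the question, the class below uses Matcher.find() so the digit run can sit anywhere in the name. The class name DateStampFinder, the method findDateStamp, and the third file name are made up for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class DateStampFinder {
    // Even-length run of 6 or more digits with no digit on either side.
    private static final Pattern DATE_RUN =
            Pattern.compile("(?<!\\d)(?:\\d\\d){3,}(?!\\d)");

    // Returns the first matching digit run, or null when the name has none.
    static String findDateStamp(String fileName) {
        Matcher m = DATE_RUN.matcher(fileName);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        System.out.println(findDateStamp("202101310000-daily"));      // 202101310000
        System.out.println(findDateStamp("my_merchant_df_20210129")); // 20210129
        System.out.println(findDateStamp("report_12345.csv"));        // null (only 5 digits)
    }
}
```

Note that find() is used rather than matches(), because matches() would require the whole file name to consist of the digit run.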
To make it a bit more specific, you could either start by matching 19 or 20 for the year and then repeat sets of 2 digits 2 or more times, or you could use an alternation to match each of the exact patterns.
(?<!\d)(?:19|20)(?:\d{2}){2,}(?!\d)
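Both stricter options can be sketched in Java as follows. The alternation shown here is only an assumption about what "the exact patterns" might be: it accepts runs of exactly 8, 12, or 14 digits (yyyymmdd, the 12-digit stamp from example 1, and yyyymmddHHMMSS), so adjust the lengths to the formats you actually expect. The class and method names are hypothetical.

```java
import java.util.regex.Pattern;

class StricterDateStamp {
    // Year must start with 19 or 20, followed by 2 or more pairs of digits.
    private static final Pattern YEAR_PREFIXED =
            Pattern.compile("(?<!\\d)(?:19|20)(?:\\d{2}){2,}(?!\\d)");

    // Assumed alternation: exactly 14, 12, or 8 digits, nothing else.
    private static final Pattern EXACT_LENGTHS =
            Pattern.compile("(?<!\\d)(?:\\d{14}|\\d{12}|\\d{8})(?!\\d)");

    static boolean hasYearPrefixedStamp(String fileName) {
        return YEAR_PREFIXED.matcher(fileName).find();
    }

    static boolean hasExactLengthStamp(String fileName) {
        return EXACT_LENGTHS.matcher(fileName).find();
    }

    public static void main(String[] args) {
        System.out.println(hasYearPrefixedStamp("202101310000-daily"));     // true
        System.out.println(hasYearPrefixedStamp("order_123456"));           // false, does not start with 19 or 20
        System.out.println(hasExactLengthStamp("my_merchant_df_20210129")); // true
        System.out.println(hasExactLengthStamp("order_123456"));            // false, 6 digits is not an accepted length
    }
}
```

Neither variant checks that the month, day, or time values are in range; they only narrow down the shape of the digit run.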