regex bash date-range

Bash: generate a regexp to match any formatted date between two dates


Using bash, I would like to generate a regexp that matches any formatted date between two dates. (I will later use it in a restricted production environment, so I would like to use only bash as far as possible.)

The dates are arbitrary, and the range may even cross a year boundary. For example, the regexp might have to match any date between

  • 2022-12-27 and
  • 2023-02-05

that is, all of these dates:

  • 2022-12-27
  • 2022-12-28
  • 2022-12-29
  • 2022-12-30
  • 2022-12-31
  • 2023-01-01
  • (…)
  • 2023-02-05

The date range comes from two parameters passed as input to the future bash script.

Finally, I will use that regexp both to find and manage filenames and to grep some data.

Filename patterns vary but always contain a date in YYYY-MM-DD format, whatever the filename is, from dd_YYYY-MM-DD.xx to aaaa__bbbb_dd_YYYY-MM-DD.xxx_ccc_zzz.log or anything else.

I tried to handle this by splitting each date into year, month and day, like

fromDate="$1"
toDate="$2"

# split each date into year, month and day so that each part is easy to access
#
fdy=$( sed 's/.*\([0-9]\{4\}\).*/\1/' <<< "$fromDate" )
fdm=$( sed 's/.*-\([0-9]\{2\}\)-.*/\1/' <<< "$fromDate" )
fdd=$( sed 's/.*-.*-\([0-9]\{2\}\).*/\1/' <<< "$fromDate" )
#
edy=$( sed 's/.*\([0-9]\{4\}\).*/\1/' <<< "$toDate" )
edm=$( sed 's/.*-\([0-9]\{2\}\)-.*/\1/' <<< "$toDate" )
edd=$( sed 's/.*-.*-\([0-9]\{2\}\).*/\1/' <<< "$toDate" )

and then looping over the parts with something like

#[...]
printf -v date "%d-%02d-%02d" "${from[2]}" "${from[1]}" "${from[0]}"
    pattern="$pattern|$date"
    ((from[0]++))
    # reset days and increment month if days exceed 31
    if [[ "${from[0]}" -gt 31 ]]
    then
        from[0]=1
        ((from[1]++))
#[...]

but I did not find a way to make this work and output a correct regexp matching every date inside the date range.
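
One possible way to complete such a loop is to let the date command do the calendar arithmetic rather than incrementing day and month counters by hand, which avoids having to know month lengths and leap years. The following is only a minimal sketch under stated assumptions: it relies on GNU date (the -d option with a relative "+ 1 day" item is not available in every date implementation, so this is not strictly bash-only), and the function name build_date_regex is hypothetical.

```bash
#!/usr/bin/env bash
# Sketch: build an alternation regexp covering every day in [from, to].
# Assumes GNU date (-d option); build_date_regex is a hypothetical name.

build_date_regex() {
    local from=$1 to=$2 day pattern=""

    # Normalise both inputs to YYYY-MM-DD and fail early on an invalid date.
    # -u keeps all arithmetic in UTC so DST changes cannot skew a step.
    day=$(date -u -d "$from" +%F) || return 1
    to=$(date -u -d "$to" +%F) || return 1

    # YYYY-MM-DD strings sort like dates, so a string test bounds the loop.
    while [[ ! "$day" > "$to" ]]; do
        pattern+="${pattern:+|}$day"          # add "|" before all but the first
        day=$(date -u -d "$day + 1 day" +%F)  # advance by one calendar day
    done

    [[ -n $pattern ]] || return 1             # empty range: from is after to
    printf '(%s)\n' "$pattern"
}
```

Called as build_date_regex 2022-12-30 2023-01-02, this should print (2022-12-30|2022-12-31|2023-01-01|2023-01-02), which can be passed to grep -E or used with the =~ operator. The pattern grows by one alternative per day, so it is best suited to short ranges.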


Solution

  • This is not a regex solution, but it may be helpful.

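    When the date format is yyyy-mm-dd, you can compare dates lexicographically, because string order and chronological order coincide.

    The string "2023-01-05" is greater than "2022-08-05" because, at the first position where the two strings differ (the fourth character), the first string has the greater character ("3" versus "2").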

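    In bash you can therefore just do the comparison [[ "$date" > "$from" ]] && [[ "$to" > "$date" ]] to see whether a date in yyyy-mm-dd string format lies within the range from $from to $to. Both comparisons are strict, so a date equal to $from or $to is not counted; to include the endpoints, test [[ ! "$date" < "$from" ]] && [[ ! "$date" > "$to" ]] instead.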
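
Applied to the filenames described in the question, the comparison might be used as in the sketch below. It uses only bash builtins. The helper names date_in_range and filter_by_date are hypothetical, the bounds are treated as inclusive, and the first YYYY-MM-DD-shaped substring in each name is assumed to be the relevant date (it is not checked for being a real calendar date).

```bash
#!/usr/bin/env bash
# Sketch: keep only names whose embedded YYYY-MM-DD falls within [from, to].
# date_in_range and filter_by_date are hypothetical helper names.

# Succeeds when from <= date <= to, using plain string comparison.
date_in_range() {
    local date=$1 from=$2 to=$3
    [[ ! "$date" < "$from" && ! "$date" > "$to" ]]
}

# Reads one name per line on stdin and prints those whose first
# embedded date lies in the inclusive range.
filter_by_date() {
    local from=$1 to=$2 name
    local re='[0-9]{4}-[0-9]{2}-[0-9]{2}'
    while IFS= read -r name; do
        if [[ $name =~ $re ]] &&
           date_in_range "${BASH_REMATCH[0]}" "$from" "$to"; then
            printf '%s\n' "$name"
        fi
    done
}
```

For example, printf '%s\n' * | filter_by_date "$1" "$2" would list the matching entries of the current directory, and the same function can filter lines of a log whose first date-shaped field is the one of interest.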