Remove lines from a text file with bash if the date is older than 30 days


I have a text file that looks like this:

test10 2016-05-30 207
test11 2016-06-01 207
test12 2016-07-20 207
test13 2016-07-21 207
test14 2016-07-25 207

I want to remove lines from the text file if their date is older than 30 days. How can I do this? I have read a bit about sed, but I'm not sure whether it can do this or how to go about it.


Solution

  • The nice thing about YYYY-MM-DD is that its lexicographic (string) sort order is identical to its chronological order, so you can generate a string representing the cutoff date and compare each line's date against it.

    If you have GNU date:

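    # cutoff date (YYYY-MM-DD) 30 days before now; -d is a GNU date option
    cutoff=$(date -d 'now - 30 days' '+%Y-%m-%d')
    # keep lines whose second field is on or after the cutoff, then replace the original file
    awk -v cutoff="$cutoff" '$2 >= cutoff { print }' <in.txt >out.txt && mv out.txt in.txt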
    

    It's also possible to rely on GNU awk (gawk) rather than GNU date:

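    gawk -v current="$(date +"%Y %m %d %H %M %S")" \
      'BEGIN {
         # mktime() takes "YYYY MM DD HH MM SS"; subtract 30 days, in seconds
         cutoff_secs = mktime(current) - (60 * 60 * 24 * 30)
       }

       {
         # convert the date in field 2 to midnight of that day, in epoch seconds
         line_secs = mktime(gensub(/-/, " ", "g", $2) " 00 00 00")
         if (line_secs >= cutoff_secs) { print }
       }' <in.txt >out.txt && mv out.txt in.txt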
    

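    Note that the gawk version places the cutoff at the current time of day 30 days ago, rather than at the beginning of that day. A line dated exactly 30 days ago is treated as midnight, so it falls before the cutoff and is dropped. Replace %H %M %S with 00 00 00 if you don't want this behavior.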
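
To see the string comparison at work without depending on today's date, here is a minimal sketch that runs the same awk test against the sample data from the question. The function name `filter_recent` and the fixed cutoff `2016-06-25` (30 days before 2016-07-25) are assumptions chosen for illustration; they are not part of the original answer.

```shell
#!/usr/bin/env bash

# Hypothetical helper: keep only lines whose second field (YYYY-MM-DD)
# is on or after the cutoff date given as $1. Reads stdin, writes stdout.
filter_recent() {
  awk -v cutoff="$1" '$2 >= cutoff'
}

# Sample data from the question.
sample='test10 2016-05-30 207
test11 2016-06-01 207
test12 2016-07-20 207
test13 2016-07-21 207
test14 2016-07-25 207'

# Fixed cutoff for a reproducible run: 30 days before 2016-07-25.
printf '%s\n' "$sample" | filter_recent 2016-06-25
# prints:
# test12 2016-07-20 207
# test13 2016-07-21 207
# test14 2016-07-25 207
```

In real use the cutoff would come from `date` as shown above. On BSD or macOS, where `date -d` is not available, `date -v-30d '+%Y-%m-%d'` is the usual equivalent, though that is worth checking against the local man page.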