awk grep csh

Awk between two dates in a logfile - almost working


I have a csh script that tries to identify entries in a logfile between two dates.

(In the script they are $start_date and $end_date, entered as DD/MM/YYYY, but I have simplified here.)

more text_B_14_FEB_03.dt | grep TMYO 

TMYO140043J:=TMYO140043J     P33BJm SOLO            03/02/2014 
TMYO140044J:=TMYO140044J     P4m    FINL            03/02/2014 
TMYO140044M:=TMYO140044M     P3BJ   FINL            03/02/2014 
TMYO140045M:=TMYO140045M     P33BJq MARS            04/02/2014 
TMYO140046M:=TMYO140046M     P33BJq RENN            04/02/2014 
TMYO140047M:=TMYO140047M     P33BJl AKHT            05/02/2014 
TMYO140048M:=TMYO140048M     P3l    MACL            05/02/2014 
TMYO140049M:=TMYO140049M     P3q    HAYE            06/02/2014 
TMYO140050M:=TMYO140050M     P3q    ROCH            06/02/2014 
TMYO140051M:=TMYO140051M     P3q    FORR            06/02/2014 
TMYO140052L:=TMYO140052L     P3v    ROSE            07/02/2014 
TMYO140053L:=TMYO140053L     P3v    CAIR            07/02/2014 
TMYO140054L:=TMYO140054L     P3v    MURR            07/02/2014 

I have tried the following, but it can't handle dates from the previous year properly:

more text_B_14_FEB_03.dt | grep TMYO | awk '$5>="02/01/2013" && $5<="13/02/2014"'

TMYO140043J:=TMYO140043J     P33BJm SOLO            03/02/2014 
TMYO140044J:=TMYO140044J     P4m    FINL            03/02/2014 
TMYO140044M:=TMYO140044M     P3BJ   FINL            03/02/2014 
TMYO140045M:=TMYO140045M     P33BJq MARS            04/02/2014 
TMYO140046M:=TMYO140046M     P33BJq RENN            04/02/2014 
TMYO140047M:=TMYO140047M     P33BJl AKHT            05/02/2014 
TMYO140048M:=TMYO140048M     P3l    MACL            05/02/2014 
TMYO140049M:=TMYO140049M     P3q    HAYE            06/02/2014 
TMYO140050M:=TMYO140050M     P3q    ROCH            06/02/2014 
TMYO140051M:=TMYO140051M     P3q    FORR            06/02/2014 
TMYO140052L:=TMYO140052L     P3v    ROSE            07/02/2014 
TMYO140053L:=TMYO140053L     P3v    CAIR            07/02/2014 
TMYO140054L:=TMYO140054L     P3v    MURR            07/02/2014   

Here it wrongly misses the entries dated 03/02/2014 when I change the start date to 04/01/2013:

more text_B_14_FEB_03.dt | grep TMYO | awk '$5>="04/01/2013" && $5<="13/02/2014"'

TMYO140045M:=TMYO140045M     P33BJq MARS            04/02/2014 
TMYO140046M:=TMYO140046M     P33BJq RENN            04/02/2014 
TMYO140047M:=TMYO140047M     P33BJl AKHT            05/02/2014 
TMYO140048M:=TMYO140048M     P3l    MACL            05/02/2014 
TMYO140049M:=TMYO140049M     P3q    HAYE            06/02/2014 
TMYO140050M:=TMYO140050M     P3q    ROCH            06/02/2014 
TMYO140051M:=TMYO140051M     P3q    FORR            06/02/2014 
TMYO140052L:=TMYO140052L     P3v    ROSE            07/02/2014 
TMYO140053L:=TMYO140053L     P3v    CAIR            07/02/2014 
TMYO140054L:=TMYO140054L     P3v    MURR            07/02/2014 

Any idea where the awk part is going wrong? I appreciate that Perl is probably the most flexible answer to this, but my Perl scripting is not there yet, and I would like to solve this using awk first.


Solution

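  • You should transform the date into YYYYMMDD format so that it can be ordered lexicographically. awk compares these fields as strings, character by character, so "03/02/2014" sorts before "04/01/2013" regardless of the year. You can do the conversion with gawk and a regex, or with substring operations in plain awk. Here is the gawk way: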

    more text_B_14_FEB_03.dt | grep TMYO | gawk 'match($5, "([0-9]+)/([0-9]+)/([0-9]+)", ary) { B = ary[3] ary[2] ary[1]; if (B >= 20130104 && B <= 20140213) print }'
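
The substring route mentioned above could look something like the sketch below, which should work in any POSIX awk rather than only gawk. It is written for a POSIX shell, not csh, since csh does not handle multi-line quoted strings well; in a csh script the awk program would have to go on one line or into a separate file run with `awk -f`. The function name `filter_by_date` and the helper `ymd` are made up for illustration. The sketch also assumes the date is the last column, as in the sample shown, and that day and month are always two digits; replace `$NF` with `$5` if that is where the date sits in the real file.

```shell
# Hypothetical helper: filter_by_date START END [FILE...]
# START and END are given as DD/MM/YYYY, like $start_date and $end_date.
filter_by_date() {
    start=$1
    end=$2
    shift 2
    awk -v start="$start" -v end="$end" '
        # Rearrange DD/MM/YYYY into YYYYMMDD so string order matches date order.
        function ymd(d) { return substr(d, 7, 4) substr(d, 4, 2) substr(d, 1, 2) }
        BEGIN { lo = ymd(start); hi = ymd(end) }
        # /TMYO/ replaces the separate grep; $NF is the last field (assumed to be the date).
        /TMYO/ { d = ymd($NF); if (d >= lo && d <= hi) print }
    ' "$@"
}
```

It would be called as `filter_by_date "$start_date" "$end_date" text_B_14_FEB_03.dt`. Both bounds are inclusive, so a start date of 04/01/2013 keeps the entries dated 03/02/2014 that the original string comparison dropped.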