shell date awk sh julian-date

Need help in converting 'while-do' block to 'awk' block for quicker processing


I need the 7th field of a CSV file converted from a Julian date (yyddd or yyJJJ) to yyyymmdd. I have the while-do loop below, and I need the same logic as an awk command for quicker processing. Can someone help?

count=0
while read -r line1; do
        col_7=$( echo $line1 | cut -d ',' -f7 | cut -c4-6)
        year1=$( echo $line1 | cut -d ',' -f7 | cut -c2-3)
        echo $col_7
        col_1=$( echo $line1 | cut -d ',' -f1,2,3,4,5,6)
        col_8=$( echo $line1 | cut -d ',' -f8 )
        date7=$(date -d "01/01/${year1} +${col_7} days -1 day" +%Y%m%d)
        echo $date7
        echo $col_1,$date7,$col_8 >> ${t2}
        count=$((count+1))
done < ${t1}

Input

xx,x,x,xxx,xxxx,xxxxx,021276,x  
xx,x,x,xxx,xxxx,xxxxx,021275,x  
xx,x,x,xxx,xxxx,xxxxx,021275,x  

Output

xx,x,x,xxx,xxxx,xxxxx,20211003,x  
xx,x,x,xxx,xxxx,xxxxx,20211002,x  
xx,x,x,xxx,xxxx,xxxxx,20211002,x  

Solution

  • Here is a solution using awk. It requires GNU awk (gawk) for its time functions. I tested it in a terminal, and it fits on a single line.

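    awk 'BEGIN { FS=OFS="," } { $7=strftime("%Y%m%d",mktime("20"substr($7,2,2)" 01 01 12 00 00")+(substr($7,4)-1)*86400) } 1' filename.txt
    

    Explanations:

    • FS is the input field separator. It is set to ",".
    • OFS is the output field separator. It is also set to ",".
    • $7 is the 7th field.
    • strftime(format, timestamp) is a built-in function that formats a timestamp, given in seconds since the epoch, according to format.
    • mktime(datespec) is a function that turns datespec into seconds since the epoch, interpreted in the local time zone. The format for datespec is YYYY MM DD HH MM SS.
    • substr($7,2,2) gets the two-digit year. The "20" prefix turns it into a four-digit year, so this assumes all dates fall in 2000-2099.
    • substr($7,4) gets the day of the year. Because these functions work in seconds, the day count has to be converted to seconds.
    • 86400 is the number of seconds in a day: 24 (hours) * 60 (minutes) * 60 (seconds).
    • The -1 in (substr($7,4)-1) is needed because January 1 is day 1, not day 0. It mirrors the "-1 day" in the original loop.
    • The base timestamp is noon (12 00 00) on January 1 rather than midnight, so that a one-hour daylight-saving shift cannot push the result into the neighboring day.
    • The trailing 1 is a pattern that is always true, so awk prints the modified line. It doesn't have to be 1; any non-zero value works. If you like RPGs, you might want to change that to 999.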
    • filename.txt is your input file.
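
If GNU awk is not available, the same conversion can probably be done with plain POSIX awk by walking a table of month lengths instead of calling mktime and strftime. The sketch below is only an illustrative alternative, not the method from the answer above. The function name julian_to_ymd is made up for this example, and it assumes that field 7 always has the form 0YYDDD, that every year falls in 2000-2099, and that the day number is valid for that year (there is no input validation).

```shell
#!/bin/sh
# Hypothetical helper (not from the original answer): rewrite field 7
# from 0YYDDD to YYYYMMDD using only POSIX awk features.
# Assumptions: years are 2000-2099; DDD is a valid day of that year.
julian_to_ymd() {
    awk '
    BEGIN {
        FS = OFS = ","
        # Month lengths for a non-leap year; February is adjusted below.
        split("31,28,31,30,31,30,31,31,30,31,30,31", mdays, ",")
    }
    {
        year = 2000 + substr($7, 2, 2)   # characters 2-3: two-digit year
        doy  = substr($7, 4, 3) + 0      # characters 4-6: day of the year
        leap = (year % 4 == 0 && year % 100 != 0) || (year % 400 == 0)

        # Subtract whole months until the remainder fits in month m.
        for (m = 1; m <= 12; m++) {
            len = mdays[m] + (m == 2 && leap)
            if (doy <= len)
                break
            doy -= len
        }

        $7 = sprintf("%04d%02d%02d", year, m, doy)
        print
    }' "$@"
}

# Usage: pass a file name, or pipe records on standard input.
printf 'xx,x,x,xxx,xxxx,xxxxx,021276,x\n' | julian_to_ymd
# prints: xx,x,x,xxx,xxxx,xxxxx,20211003,x
```

Because this version works on calendar arithmetic rather than on seconds, it does not depend on the local time zone. With a real file it would be called as julian_to_ymd filename.txt.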