Search code examples
dateawksedcut

How to convert in file csv date in specific column to unix date


I have a file csv with this columns:

"Weight","Impedance","Units","User","Timestamp","PhysiqueRating"
"58.75","5.33","kg","7","2020-7-11 19:29:29","5"

Of course, I can convert the date command: date -d '2020-7-11 19:29:29' +%s

Results: 1594488569

How to replace this date in csv file in bash script?


Solution

  • With GNU sed

    sed -E '2,$ s/(("[^"]*",){4})("[^"]+")(.*)/echo \x27\1"\x27$(date -d \3 +%s)\x27"\4\x27/e'
    
    • 2,$ to skip header from getting processed
    • (("[^"]*",){4}) first four columns
    • ("[^"]+") fifth column
    • (.*) rest of the line
    • echo \x27\1"\x27 and \x27"\4\x27 preserve first four columns and rest of line after fifth column, along with adding double quotes to result of date conversion
    • $(date -d \3 +%s) calling shell command with fifth column value

    Note that this command will fail if input can contain single quotes. That can be worked around by using s/\x27/\x27\\&\x27/g.


    You can see the command that gets executed by using -n option and pe flags

    sed -nE '2,$ s/(("[^"]*",){4})("[^"]+")(.*)/echo \x27\1"\x27$(date -d \3 +%s)\x27"\4\x27/pe'
    

    will give

    echo '"58.75","5.33","kg","7","'$(date -d "2020-7-11 19:29:29" +%s)'","5"'
    


    For 58.25,5.89, kg, 7,2020 / 7/12 11:23:46, "5" format, try

    sed -E '2,$ s/(([^,]*,){4})([^,]+)(.*)/echo \x27\1\x27$(date -d "\3" +%s)\x27\4\x27/e'
    

    or (adapted from https://stackoverflow.com/a/62862416)

    awk 'BEGIN{FS=OFS=","} NR>1{$5=mktime(gensub(/[:\/]/, " ", "g", $5))} 1'
    


    Note: For the sed solution, if the input can come from outside source, you'll have to take care to avoid malicious intent as mentioned in the comments. One way is to match the fifth column using [0-9: -]+ or similar.