
remove seconds from time (awk, sed)


I have a file like this:

XX1, 1.1,24.08.1994 13:00:00, 111,112,113
XX2, 1.2,24.08.1994 13:30:00, 121,122,123
XX3, NaN,22.08.1995 15:00,    131,132,133

So the time format is not consistent: some lines have the time as hh:mm:ss and others as hh:mm. I would like to remove the seconds and get a file like this:

XX1, 1.1,24.08.1994 13:00, 111,112,113
XX2, 1.2,24.08.1994 13:30, 121,122,123
XX3, NaN,22.08.1995 15:00, 131,132,133

What I tried so far is

#!/bin/bash
sed 's@,\(..\):\(..\):\(..\) @,\1:\2 @' < time_fault > ./time_corrected

and

#!/usr/bin/awk -f
BEGIN { RS="," ; FS=":"; ORS=","}
{ getline str
gsub(/*..:..:..*/,  $1":"$2 str) > time_corrected }

but neither worked.


Solution

  • With sed only one capture group is needed:

    $ sed -re 's/([0-9]{2}:[0-9]{2}):[0-9]{2},/\1,/' -e 's/, +/, /g' file
    XX1, 1.1,24.08.1994 13:00, 111,112,113
    XX2, 1.2,24.08.1994 13:30, 121,122,123
    XX3, NaN,22.08.1995 15:00, 131,132,133
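
If your sed does not accept -r, the same idea can be written with POSIX basic regular expressions. This is only a minimal sketch: the function name strip_seconds is made up for illustration, and the sample lines are copied from the question.

```shell
#!/bin/sh
# Hypothetical helper: reads lines on stdin, writes corrected lines to stdout.
strip_seconds() {
    # 1st expression: keep hh:mm, drop a :ss that sits right before a comma.
    # 2nd expression: squeeze a run of spaces after a comma down to one space.
    sed -e 's/\([0-9][0-9]:[0-9][0-9]\):[0-9][0-9],/\1,/' \
        -e 's/,  */, /g'
}

# Sample input copied from the question.
input='XX1, 1.1,24.08.1994 13:00:00, 111,112,113
XX2, 1.2,24.08.1994 13:30:00, 121,122,123
XX3, NaN,22.08.1995 15:00,    131,132,133'

result=$(printf '%s\n' "$input" | strip_seconds)
printf '%s\n' "$result"
```

Note that in the data the time is preceded by a space and followed by a comma, which is why the pattern anchors on the trailing comma rather than on a leading one.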
    

    awk may be a better fit: apply the substitution to the third field only when it contains seconds; otherwise run a substitution on the fourth field so that the extra spaces in front of it are dropped:

    $ awk '{if ($3~/([0-9]{2}:){2}/) sub(/:[0-9]{2},/,",",$3);else sub(/ */,"",$4)}1' file
    XX1, 1.1,24.08.1994 13:00, 111,112,113
    XX2, 1.2,24.08.1994 13:30, 121,122,123
    XX3, NaN,22.08.1995 15:00, 131,132,133
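
The awk one-liner splits on whitespace, so the time ends up in $3 only because the date and time are separated by a single space. If you would rather work with the comma-separated columns, something along these lines might do. It is a sketch under the assumption that the date and time always sit in the third comma-separated column; strip_seconds_awk is a made-up name and the sample is the data from the question. It avoids {2} intervals, which some older awk versions do not support.

```shell
#!/bin/sh
# Hypothetical helper: reads lines on stdin, writes corrected lines to stdout.
strip_seconds_awk() {
    awk 'BEGIN { FS = OFS = "," }
    {
        # Only strip when column 3 really ends in hh:mm:ss,
        # so that a plain hh:mm does not lose its minutes.
        if ($3 ~ /[0-9][0-9]:[0-9][0-9]:[0-9][0-9]$/)
            sub(/:[0-9][0-9]$/, "", $3)
        # Reduce the padding in front of column 4 to a single space.
        sub(/^ +/, " ", $4)
        print
    }'
}

# Sample input copied from the question.
input='XX1, 1.1,24.08.1994 13:00:00, 111,112,113
XX2, 1.2,24.08.1994 13:30:00, 121,122,123
XX3, NaN,22.08.1995 15:00,    131,132,133'

result_awk=$(printf '%s\n' "$input" | strip_seconds_awk)
printf '%s\n' "$result_awk"
```

Splitting on commas keeps every other column byte-for-byte as it was, since awk rejoins the fields with the same "," separator.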