bash perl awk sed date-format

How to change the date format in a log file using bash, avoiding a while loop


Similar questions have been asked before (here and here), but the details of this one differ.

My input log file looks like:

TEMP MON -=- Sat Aug 15 02:20:24 EEST 2020 -=- 48.6
TEMP MON -=- Sat Aug 15 02:20:50 EEST 2020 -=- 49.1
TEMP MON -=- Sat Aug 15 02:21:13 EEST 2020 -=- 49.1
TEMP MON -=- Sat Aug 15 02:21:44 EEST 2020 -=- 49.1
TEMP MON -=- Sat Aug 15 02:21:45 EEST 2020 -=- 48.6
TEMP MON -=- Sat Aug 15 02:21:52 EEST 2020 -=- 49.1
TEMP MON -=- Sat Aug 15 02:21:53 EEST 2020 -=- 48.6
TEMP MON -=- Sat Aug 15 02:21:54 EEST 2020 -=- 49.6
TEMP MON -=- Sat Aug 15 02:21:56 EEST 2020 -=- 49.1
TEMP MON -=- Sat Aug 15 02:21:57 EEST 2020 -=- 49.1

and the output should look like:

TEMP MON -=- 2020-08-15_02:20:24 EEST -=- 48.6
...

It is simple enough to change the format of a single date in bash using

date -d "${date_in_current_format}" "+DATE_IN_NEW_FORMAT"

It is also possible (albeit inefficient) to iterate over the log file using a while loop and change the dates line by line (see the 1st link again).

However, I am looking for a bash solution that uses sed or perl (or awk or anything else for that matter) to carry out the same task.

The closest I have come, which still does not work, is the following search-and-replace command:

perl -pe "s/(.*) -=- (.*) -=- (.*)/\1 -=- $( date \2 "+%Z %Y-%m-%d_%H:%M:%S" ) -=- \3/" <file>

and with sed something similar:

sed "s:\(.*\) -=- \(.*\) -=- \(.*\):\1 -=- $( date -d \2 "+%Z %Y-%m-%d_%H:%M:%S" ) -=- \3:" <file>

In both cases the problem is that I cannot get the backreference "\2" to be expanded inside the bash command substitution that runs date.
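One way to see why the backreference never reaches date is to swap date for a helper that just reports its argument. The sketch below assumes a hypothetical function named show_arg (not part of the original attempts). Since the whole sed script is in double quotes, the shell runs the command substitution once, before sed starts, so the helper sees only the text left after the shell strips the backslash.

```shell
#!/usr/bin/env bash
# show_arg is a hypothetical helper: it reports the argument it was given.
show_arg() { printf 'arg=<%s>' "$1"; }

line='TEMP MON -=- Sat Aug 15 02:20:24 EEST 2020 -=- 48.6'

# The shell runs $(show_arg \2) first, so the helper receives "2"
# (the unquoted backslash is removed), never the captured date.
result=$(printf '%s\n' "$line" |
    sed "s:\(.*\) -=- \(.*\) -=- \(.*\):\1 -=- $(show_arg \2) -=- \3:")

printf '%s\n' "$result"
# prints: TEMP MON -=- arg=<2> -=- 48.6
```

By the time sed processes the line, the replacement text is already fixed, which is why the date has to be converted inside the tool doing the matching rather than by the surrounding shell.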


Solution

  • Using only awk string functions, you can avoid both the GNU awk datetime functions and the external date command, since all that is needed is to convert the month name to a number and reorder the fields.

    > cat tst.awk
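    BEGIN { OFS=FS="-=-" }    # split and rejoin records on the "-=-" separator
    {
        # $2 is " Sat Aug 15 02:20:24 EEST 2020 ": weekday, month, day, time, zone, year
        split($2, arr, " ")
        # Each month name occupies 3 characters, so Aug (position 22) maps to (22+2)/3 = 8
        m=(index("JanFebMarAprMayJunJulAugSepOctNovDec", arr[2])+2)/3
        # Rebuild the field as " YYYY-MM-DD_HH:MM:SS ZONE "; OFS rejoins the record
        $2=sprintf(" %04d-%02d-%02d_%s %s ", arr[6], m, arr[3], arr[4], arr[5])
        print
    }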
    

    Usage:

    > awk -f tst.awk file
    TEMP MON -=- 2020-08-15_02:20:24 EEST -=- 48.6
    TEMP MON -=- 2020-08-15_02:20:50 EEST -=- 49.1
    TEMP MON -=- 2020-08-15_02:21:13 EEST -=- 49.1
    TEMP MON -=- 2020-08-15_02:21:44 EEST -=- 49.1
    TEMP MON -=- 2020-08-15_02:21:45 EEST -=- 48.6
    TEMP MON -=- 2020-08-15_02:21:52 EEST -=- 49.1
    TEMP MON -=- 2020-08-15_02:21:53 EEST -=- 48.6
    TEMP MON -=- 2020-08-15_02:21:54 EEST -=- 49.6
    TEMP MON -=- 2020-08-15_02:21:56 EEST -=- 49.1
    TEMP MON -=- 2020-08-15_02:21:57 EEST -=- 49.1
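The month lookup in the script above can be checked on its own. This is a minimal sketch that assumes a hypothetical shell function named month_number. It also returns 0 for a name that is not found, which is an addition for illustration and not something the original script does. It expects the exact three-letter English abbreviations that appear in the log.

```shell
#!/usr/bin/env bash
# month_number is a hypothetical wrapper around the same index() arithmetic.
month_number() {
    awk -v mon="$1" 'BEGIN {
        pos = index("JanFebMarAprMayJunJulAugSepOctNovDec", mon)
        # Jan starts at position 1, Feb at 4, Mar at 7, and so on,
        # so (pos + 2) / 3 yields 1..12; 0 marks a name that was not found.
        print (pos ? (pos + 2) / 3 : 0)
    }'
}

for mon in Jan Aug Dec; do
    printf '%s -> %s\n' "$mon" "$(month_number "$mon")"
done
# prints:
# Jan -> 1
# Aug -> 8
# Dec -> 12
```

Because the lookup is plain string arithmetic, it does not depend on the locale or on how date parses time zone abbreviations such as EEST.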