AWK: convert timestamp to epoch; first record always returns -1


I have an input file with timestamps and odometer readings, like so:

2017-09-16 18:14:00,80465
2017-09-19 18:23:00,80898
2017-09-21 08:05:00,81253
2017-09-27 18:20:00,82155
2017-10-03 18:36:00,82902
2017-10-09 18:33:00,83699

... and would like to add the timestamp converted to an epoch, with the following code:

BEGIN {}
{
    # change timestamp to epoch (without TZ correction)
    epoch_returned = timestamp_to_epoch($1)
    printf("%s,%s\n", $0, epoch_returned)
}

function timestamp_to_epoch(timestamp_in) {
    FPAT = "[0-9][0-9]"
    epoch_out = mktime($1$2" "$3" "$4" "$5" "$6" "$7)
    return epoch_out
}

The output produces this:

2017-09-16 18:14:00,80465,-1
2017-09-19 18:23:00,80898,1505809380
2017-09-21 08:05:00,81253,1505945100
2017-09-27 18:20:00,82155,1506500400
2017-10-03 18:36:00,82902,1507019760
2017-10-09 18:33:00,83699,1507537980

The first line always returns -1. When I delete the first line of the input file, the (then) first line still returns -1. With only one line left in the input, it also returns -1.

If I put an empty line in the input file, it returns -1, while all other lines get the epoch as expected.

I have looked at this for quite a while and can't figure it out.

I am using GNU Awk 5.0.1, API: 2.0 (GNU MPFR 4.0.2, GNU MP 6.2.0) on Linux Mint 20.1 Cinnamon 4.8.6.

Any hints appreciated.


Solution

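  • Setting FPAT inside the function is why the first record fails: gawk has already split the current record into fields by the time FPAT is assigned, so the new pattern only takes effect when the next record is read. For the first record, $1 through $7 still come from the default whitespace splitting, so mktime() gets a malformed string and returns -1. Note also that the function ignores its timestamp_in argument and reads the record's fields directly. If you wanted to use FPAT to split your input into fields, you'd have to set it in the BEGIN section, but that's NOT what you're trying to do: your data is ,-separated and you're just trying to separate your timestamp into 2-digit segments. To do that, use patsplit() instead of FPAT (both gawk-only, just like mktime()):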

    $ cat tst.awk
    BEGIN { FS=OFS="," }
    {
        # change timestamp to epoch (without TZ correction)
        epoch_returned = timestamp_to_epoch($1)
        print $0, epoch_returned
    }
    
    function timestamp_to_epoch(timestamp_in,       t, epoch_out) {
        patsplit(timestamp_in,t,/[0-9][0-9]/)
        epoch_out = mktime(t[1] t[2] " " t[3] " " t[4] " " t[5] " " t[6] " " t[7])
        return epoch_out
    }
    

    $ awk -f tst.awk file
    2017-09-16 18:14:00,80465,1505603640
    2017-09-19 18:23:00,80898,1505863380
    2017-09-21 08:05:00,81253,1505999100
    2017-09-27 18:20:00,82155,1506554400
    2017-10-03 18:36:00,82902,1507073760
    2017-10-09 18:33:00,83699,1507591980
    

    but personally I'd use plain, old split() instead of patsplit():

    split(timestamp_in,t,/[- :]/)
    epoch_out = mktime(t[1] " " t[2] " " t[3] " " t[4] " " t[5] " " t[6])