I have a series of log files which use CSV format, with the first field on each line consisting of a timestamp surrounded by double quotes, as below:
"2018-10-12 00:08:28",248351,1659.912,1.145031,6.180728
"2018-10-13 02:14:16",248486,243.657,0.513548,9.661507
"2018-10-13 22:31:52",248920,1014.364,0.357985,4.153846
"2018-10-14 06:19:31",249035,629.172,1.668043,8.029534
I am using a bash script to manipulate these log files, including awk to select records within a specified range based on the timestamp. The double quotes don't play nicely, so I have to escape them as below to extract the appropriate rows:
awk '
BEGIN { FS=","; ts="\"2018-10-13 00:00:00\""; st="\"2018-10-14 00:00:00\"" }
$1>=ts && $1<st { print $0 }
' $file.in > $file.out
I would like to instead specify the timestamps as parameters to my shell script rather than have them hard-coded in the script; however, I haven't been able to figure out how to make this hand-off to awk within the script, especially when accounting for the necessary double quotes in the field value.
In my bash script, I tried to create variables ts and st with the timestamp strings representing the starting and ending bounds, then reference these variables within the later call to awk.
ts="\"2018-10-13 00:00:00\""
st="\"2018-10-14 00:00:00\""
This doesn't work:
awk '
BEGIN { FS=","; ts=${ts}; st=${st} }
$1>=ts && $1<st { print $0 }
' $file.in > $file.out
Neither does this:
awk '
BEGIN { FS="," }
$1>=${ts} && $1<${st} { print $0 }
' $file.in > $file.out
I suspect there may be two issues here: how to handle the double quotes in the timestamp values, and how to reference a bash script argument (or bash variable) in my awk command?

Variables aren't expanded inside single quotes. The correct way is to use the -v option to awk to initialize variables:
awk -v ts="$ts" -v st="$st" -F, '$1 >= ts && $1 < st' "$file.in" > "$file.out"
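
This works because the bash variables still contain the literal double quotes, which match the quoted first field as-is, and because timestamps in "YYYY-MM-DD HH:MM:SS" form sort chronologically under plain string comparison. As a side note (an alternative to the assignments shown in the question, not something the fix requires), wrapping the values in single quotes in bash avoids the backslash escapes:

ts='"2018-10-13 00:00:00"'
st='"2018-10-14 00:00:00"'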
Note also the use of -F to initialize FS, and that you don't need { print $0 } since printing the record is the default action when the condition is true.
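
To take the bounds as script parameters, as the question asks, a minimal sketch of the full script might look like the following; the script name, positional-argument layout, and the file-prefix argument are assumptions for illustration, not part of the original:

#!/bin/bash
# filter_logs.sh (hypothetical name) -- keep rows whose timestamp is in [start, end)
# Usage: ./filter_logs.sh "2018-10-13 00:00:00" "2018-10-14 00:00:00" logname
# Wrap the arguments in literal double quotes so they compare
# against the quoted first CSV field exactly as it appears.
ts="\"$1\""
st="\"$2\""
file="$3"
awk -v ts="$ts" -v st="$st" -F, '$1 >= ts && $1 < st' "$file.in" > "$file.out"

Invoked as above, this would read logname.in and write the matching rows to logname.out.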