I have a variable containing the content of a file with multiple lines. The variable is parsed by a chain of commands (awk, sed, ...) which act as filters and post-processors of the variable.
echo "$variable" | awk1 | sed1 | awk2
The problem is not the processing itself, but the fact that I modify each line along the way and lose track of its original value. The final awk does a conditional check and, depending on the outcome, should either print the original line or nothing. This is where my problem lies.
I assume it would be a good idea to store the original line in a variable right after echo, but all my attempts to pass it on to the following subshells have failed. The solution has to be portable (POSIX-compliant).
Format of variable:
John Smith - - [21/Mar/2017:09:24:33 +0100] Physics
Adam Miller - - [22/Feb/2019:09:24:33 +0100] Chemistry
I want to compare the dates in this file with a given date in YYYYMMDDHHMMSS format (for example 20180101151515), and if a line contains a later date, I want to print the whole line.
My code so far:
date_after="19960101151515"
process=$(echo "$variable" |awk -F' - - ' '{print $2}' | sed "s/Jan/01/; s/Feb/02/;
s/Mar/03/; s/Apr/04/; s/May/05/; s/Jun/06/; s/Jul/07/;
s/Aug/08/; s/Sep/09/; s/Oct/10/; s/Nov/11/; s/Dec/12/" | awk -F'[/:\\[ ]' -v date="$date_after" '{b=$4$3$2$5$6$7; if (b > date) {print $0}}')
Any combination of sed, awk, grep, cut, ... can generally be replaced with a single awk. This also allows you to store the original data and return it based on a condition.
You can easily see that the following awk does the conversion you are interested in (it replaces the first awk and the sed):
awk '{ t=$0
match(t,"\\["); t=substr(t,RSTART+1)
match(t," ") ; t=substr(t,1,RSTART-1); split(t,a,"[/:]")
day=a[1]; year=a[3]; hhmmss=a[4]a[5]a[6];
month=sprintf("%02d",(match("JanFebMarAprMayJunJulAugSepOctNovDec",a[2])+2)/3)
print year month day hhmmss, t}'
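The month conversion is the least obvious part, so here is a minimal sketch of that step in isolation. It assumes well-formed, capitalised three-letter English month names; the variable name `months` and the sample input are made up for illustration. `match()` returns the 1-based position of the name inside the lookup string (1, 4, 7, ...), so `(position + 2) / 3` yields 1 to 12.

```shell
# Map three-letter month names to two-digit numbers with POSIX awk.
# Jan is found at position 1, Feb at 4, Mar at 7, ..., Dec at 34,
# so (position + 2) / 3 gives the month number 1..12.
# Assumption: input is a valid month name; anything else gives "00"
# or a wrong number, because match() returns 0 or an unaligned position.
months=$(printf '%s\n' Jan Feb Mar Dec |
  awk '{ printf "%s %02d\n", $1, (match("JanFebMarAprMayJunJulAugSepOctNovDec", $1) + 2) / 3 }')
printf '%s\n' "$months"
```

If the input may contain other spellings (lowercase, localised names), a lookup array filled in a BEGIN block would probably be the safer choice.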
So now you can plug in your conditional on t and return the original $0 if need be:
awk -v d="$date_after" '
{ t=$0
match(t,"\\["); t=substr(t,RSTART+1)
match(t," ") ; t=substr(t,1,RSTART-1); split(t,a,"[/:]")
day=a[1]; year=a[3]; hhmmss=a[4]a[5]a[6];
month=sprintf("%02d",(match("JanFebMarAprMayJunJulAugSepOctNovDec",a[2])+2)/3)
t=year month day hhmmss   # comparable YYYYMMDDHHMMSS value, not the raw date text
}
(t > d) { print $0 }'
Based on: "convert month from Aaa to xx in little script with awk"
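For completeness, here is a self-contained sketch of the whole thing run against the two sample lines from the question. The threshold value, the `result` variable and the `ts` variable are assumptions chosen for the demonstration, and `printf` is used instead of `echo` only because its behaviour is more predictable across shells. Both sides of the comparison are forced to strings, which works because both are fixed-width 14-digit values.

```shell
#!/bin/sh
# Sample data in the format from the question.
variable='John Smith - - [21/Mar/2017:09:24:33 +0100] Physics
Adam Miller - - [22/Feb/2019:09:24:33 +0100] Chemistry'
date_after="20180101151515"   # illustrative threshold, YYYYMMDDHHMMSS

result=$(printf '%s\n' "$variable" | awk -v d="$date_after" '
{
  t = $0
  match(t, /\[/); t = substr(t, RSTART + 1)      # drop everything up to "["
  match(t, / /);  t = substr(t, 1, RSTART - 1)   # keep "21/Mar/2017:09:24:33"
  split(t, a, "[/:]")                            # day, month, year, hh, mm, ss
  month = sprintf("%02d", (match("JanFebMarAprMayJunJulAugSepOctNovDec", a[2]) + 2) / 3)
  ts = a[3] month a[1] a[4] a[5] a[6]            # YYYYMMDDHHMMSS
}
(ts "" > d "") { print }                         # $0 is still the untouched input line
')
printf '%s\n' "$result"
```

With the threshold above, only the 2019 line is later than the given date, so only the Adam Miller line is printed. Since awk never modifies $0 here, there is no need to carry the original line through separate subshells.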