If I have a file which is in the form:
ID Temp Time
0 1 0.1
0 2 0.2
0 1 0.3
0 2 0.4
1 1 0.1
1 2 0.2
1 2 0.4
1 3 0.6
I want to add a third column which uses the difference in Value2 in adjacent rows, and multiplies it by Value 1 for rows with same ID, so the expected output would be:
ID Temp Time Calculated
0 1 0.1
0 2 0.2 0.1
0 1 0.3 0.2
0 2 0.4 0.1
1 1 0.1
1 2 0.2 0.1
1 2 0.4 0.4
1 3 0.6 0.4
Which I can almost get using the code:
awk 'NR>1{$4=(($3-p)*(q))} {p=$3} {q=$2} 1' input> output
But this runs over the change of ID giving:
ID Temp Time Calculated
0 1 0.1
0 2 0.2 0.1
0 1 0.3 0.2
0 2 0.4 0.1
1 1 0.1 -0.6
1 2 0.2 0.1
1 2 0.4 0.4
1 3 0.6 0.4
Except this should restart everytime the ID changes and not carry on, is there a way to add this to awk?
You may use this awk
:
awk -v OFS='\t' 'NR == 1 {print $0, "Calculated"; next} {print $0, (NR>2 && p1 == $1 ? ($3-p3)*p2 : "")} {p1 = $1; p2 = $2; p3 = $3}' file
ID Temp Time Calculated
0 1 0.1
0 2 0.2 0.1
0 1 0.3 0.2
0 2 0.4 0.1
1 1 0.1
1 2 0.2 0.1
1 2 0.4 0.4
1 3 0.6 0.4
To make it more readable:
awk -v OFS='\t' 'NR == 1 {
print $0, "Calculated"
next
}
{
print $0, (NR > 2 && p1 == $1 ? ($3 - p3) * p2 : "")
}
{
p1 = $1
p2 = $2
p3 = $3
}' file