I have 2 files. Sample values of file1 are as follows:
1313 0 60
1313 1 60
1314 0 60
1314 1 57
1315 1 60
1316 0 60
1316 1 57
1317 1 57
1318 1 57
1333 0 57
1333 1 57
1334 0 60
1334 1 60
Sample values of file2 are as follows:
813 0 91
813 1 91
814 0 91
814 1 91
815 0 96
815 1 91
816 0 91
816 1 91
817 1 96
818 0 91
832 0 96
833 0 91
833 1 91
834 0 96
I am trying to modify file1 and create a file3 with the following values (as you can see, the values in the last column of file1 are irrelevant):
1 0
1 1
2 0
2 1
3 1
4 0
4 1
5 1
6 1
21 0
21 1
22 0
22 1
Similarly, file2 needs to be modified to create a file4 with the following values (the values in the last column of file2 are irrelevant):
1 0
1 1
2 0
2 1
3 0
3 1
4 0
4 1
5 1
6 0
20 0
21 0
21 1
22 0
After creating file3 and file4, I intend to check their similarity using the diff utility. To generate file3 and file4, I am trying to write an awk script, but as a beginner to awk I find the task very time-consuming. Any guidance would be greatly appreciated.
We can capture the value of $1 on the first row and then use it in a formula to calculate the offset. This assumes the smallest $1 is in the first row.
awk 'NR==1 { i=$1 } { print $1-i+1,$2 }'
So for example, you can do:
awk 'NR==1 { i=$1 } { print $1-i+1,$2 }' file1 > file3
awk 'NR==1 { i=$1 } { print $1-i+1,$2 }' file2 > file4
diff file3 file4
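If the smallest $1 is not guaranteed to be on the first row, one possible variation is to read the file twice: once to find the minimum and once to print the offsets. This is only a sketch; the helper name renumber_from_min and the sample file unsorted.txt are made up for illustration and are not part of the question.

```shell
# renumber_from_min is a hypothetical helper name, not from the question.
# Pass 1 (NR==FNR): remember the smallest $1.
# Pass 2: print each $1 as an offset from that minimum, followed by $2.
renumber_from_min() {
    awk 'NR==FNR { if (min == "" || $1+0 < min) min = $1+0; next }
         { print $1 - min + 1, $2 }' "$1" "$1"
}

# Made-up sample where the smallest key (1313) is not on the first line.
printf '1314 0 60\n1313 1 60\n1315 1 57\n' > unsorted.txt
renumber_from_min unsorted.txt
```

On this sample the output is "2 0", "1 1", "3 1". The file is passed twice on purpose, so this only works with a regular file and not with input piped on stdin.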
This was my previous version, before I noticed you were really looking for an offset. I had assumed you just wanted to renumber based on changes in $1. We can set up a variable to detect value changes between rows and only increment the counter when $1 changes. This assumes that rows with the same $1 are grouped together.
awk 'n!=$1 { i++ } { print i,$2 } { n=$1 }'
So for example, you can do:
awk 'n!=$1 { i++ } { print i,$2 } { n=$1 }' file1 > file3
awk 'n!=$1 { i++ } { print i,$2 } { n=$1 }' file2 > file4
diff file3 file4
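To see how the two approaches differ, here is a small comparison on four rows copied from the file1 sample in the question. The file name excerpt.txt is made up for illustration.

```shell
# Four rows copied from the question's file1 sample; excerpt.txt is a hypothetical name.
printf '1313 0 60\n1314 1 57\n1333 0 57\n1334 1 60\n' > excerpt.txt

# Offset from the first row: the gap between 1314 and 1333 is preserved.
awk 'NR==1 { i=$1 } { print $1-i+1,$2 }' excerpt.txt

# Counter incremented on each change of $1: the gap is collapsed.
awk 'n!=$1 { i++ } { print i,$2 } { n=$1 }' excerpt.txt
```

The first command prints "1 0", "2 1", "21 0", "22 1", which matches the numbering asked for in the question. The second prints "1 0", "2 1", "3 0", "4 1", so it only fits when gaps between $1 values should be ignored.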