Modifying two files and comparing their similarity

I have 2 files. Sample values of file1 are as follows:

Sample values of file2 are as follows:

I am trying to modify file1 and create a file3 with the following values (as you can see, the values in the last column of file1 are irrelevant):

Also, the file2 needs to be modified, and a file4 is to be created with the following values (the values in the last column of file2 are irrelevant):

After creating file3 and file4, I intend to check their similarity using the diff utility. To generate file3 and file4, I am trying to write an awk script, but as an awk beginner I find the task very time-consuming. Any guidance would be greatly appreciated.

Solution

We can capture the value of $1 from the first row and then subtract it from $1 on every row to calculate the offset, so that the first row becomes 1. This assumes the smallest $1 is in the first row.

awk 'NR==1 { i=$1 } { print $1-i+1,$2 }'

So for example, you can do:

awk 'NR==1 { i=$1 } { print $1-i+1,$2 }' file1 > file3
awk 'NR==1 { i=$1 } { print $1-i+1,$2 }' file2 > file4
diff file3 file4
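
To make the effect of the offset concrete, here is a small self-contained run. The input rows are made up for illustration (they are not the sample values from the question), and the files in the temporary directory are hypothetical stand-ins for file1 through file4.

```shell
# Hypothetical input: three whitespace-separated columns, the last one irrelevant.
tmp=$(mktemp -d)
printf '%s\n' '100 a x' '101 b y' '103 c z' > "$tmp/file1"
printf '%s\n' '200 a p' '201 b q' '203 c r' > "$tmp/file2"

# Remember $1 from the first row, then shift every $1 so the first row becomes 1.
awk 'NR==1 { i=$1 } { print $1-i+1,$2 }' "$tmp/file1" > "$tmp/file3"
awk 'NR==1 { i=$1 } { print $1-i+1,$2 }' "$tmp/file2" > "$tmp/file4"

cat "$tmp/file3"   # rows: "1 a", "2 b", "4 c"

# diff prints nothing and exits with status 0 when the two files are identical.
diff "$tmp/file3" "$tmp/file4" && echo "files match"
```

With this made-up input, both files reduce to the same three rows, so diff reports no differences. Note that the gap between 101 and 103 survives as a gap between 2 and 4.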

This was my previous version, written before I noticed you were really looking for an offset. I had assumed you just wanted to renumber the rows based on changes in $1. We can use a variable to hold the value of $1 from the previous row and increment a counter only when $1 changes. This assumes that rows with the same $1 are grouped together.

awk 'n!=$1 { i++ } { print i,$2 } { n=$1 }'

So for example, you can do:

awk 'n!=$1 { i++ } { print i,$2 } { n=$1 }' file1 > file3
awk 'n!=$1 { i++ } { print i,$2 } { n=$1 }' file2 > file4
diff file3 file4
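
The two versions only differ when $1 repeats or skips values. The sketch below runs both on the same made-up input (again, not the sample values from the question; the file names are hypothetical) so the difference is visible side by side.

```shell
tmp=$(mktemp -d)
# Hypothetical input with a repeated key (100) and a gap (no 101 or 102).
printf '%s\n' '100 a x' '100 b y' '103 c z' > "$tmp/in"

# Counter version: i goes up by one each time $1 differs from the previous row.
awk 'n!=$1 { i++ } { print i,$2 } { n=$1 }' "$tmp/in" > "$tmp/counter"

# Offset version, for comparison: the gap in $1 is preserved.
awk 'NR==1 { i=$1 } { print $1-i+1,$2 }' "$tmp/in" > "$tmp/offset"

cat "$tmp/counter"   # rows: "1 a", "1 b", "2 c"
cat "$tmp/offset"    # rows: "1 a", "1 b", "4 c"
```

One caveat with the counter version, as far as I can tell: n starts out unset, which awk compares as 0 against a numeric $1, so a first row whose $1 is 0 would not bump the counter. If that can occur in your data, the pattern `NR==1 || n!=$1` should avoid it.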