I have files in the following format:
in file0.txt
234 ABC
123 KJN
245 MDJ
274 FWX
in file1.txt
45 ABC
1 KJN
546 MDJ
456 FWX
similarly, I have 24 files in total from file0.txt to file23.txt I am trying to join all these files into one file based on the 2nd column as the key. Desired output is:
ABC 234 45
KJN 123 1
MDJ 245 546
FWX 274 456
I wrote something like this:
for(i=0;i<=23;i++)
do
if[ -e file$i ]
then
join -1 2 -2 2 file$i.txt file`expr $i +1`.txt > fileJoined.txt
fileJoined.txt > file`expr $i +1`.txt
fi
done
The logic fails after the first run as the joined file structure is changed and the join tries to join it on second column . Please suggest how to correct this. Also is there any better way to achieve this? Please note I had carefully used a space in if [ -e file$i ]
Using awk:
$ awk '{a[$2]=a[$2] (a[$2]==""?"":OFS) $1}END{for(i in a)print i,a[i]}' file1 file2
Output:
ABC 234 45
FWX 274 456
MDJ 245 546
KJN 123 1
The script stores records to memory so you should have enough memory to handle all required files. Explained:
$ awk '{
a[$2]=a[$2] (a[$2]==""?"":OFS) $1 # hash all records to a, index with $2
}
END { # after hashing all records
for(i in a) # go thru all of them in random order
print i,a[i] # and output them
}' file1 file2