I want to get the remaining difference between two files that have redundant entries.
File1.txt:
Data1
Data1
Data2
Data2
Data3
Data3
Data3
Data3
Data4
Data5
Data6
Data6
and
File2.txt:
Data1
Data2
Data2
Data3
Data3
Data4
Data5
Data6
Finalfile.txt:
Data1
Data3
Data3
Data6
In other words: if an entry shows up n times in File1.txt and m times in File2.txt, the final file should contain n-m copies of that entry. For example, there are four entries of Data3 in File1.txt and only two in File2.txt, so Finalfile.txt has 2 occurrences of Data3.
I've tried:
grep -v -f File1.txt File2.txt > Finalfile.txt
but it gives the absolute differences.
You may use this two-pass awk solution:
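awk '
# First pass (File1.txt): count how many times each entry occurs
NR == FNR {
    ++fq[$1]
    next
}
# Second pass (File2.txt): subtract one for each matching entry
{
    --fq[$1]
}
# Print each entry as many times as its remaining count
# (for-in order is unspecified in awk, so output order may vary)
END {
    for (s in fq)
        for (i = 1; i <= fq[s]; ++i)
            print s
}' File1.txt File2.txt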
Data1
Data3
Data3
Data6
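If the order of the surviving lines matters, a variant that streams File1.txt and cancels one occurrence per matching line of File2.txt may be worth trying. This is only a sketch, not the approach above: the function name multiset_diff is made up for illustration, it compares whole lines ($0) rather than the first field, and it tests FILENAME instead of NR == FNR on the assumption that the second file could be empty.

```shell
# multiset_diff is a hypothetical helper name used for illustration only.
# Usage: multiset_diff FILE_A FILE_B
# Prints the lines of FILE_A left over after cancelling one occurrence
# for every matching line in FILE_B, keeping the order of FILE_A.
multiset_diff() {
    awk '
    # FILE_B is passed to awk first: count how many copies to cancel
    FILENAME == ARGV[1] { skip[$0]++; next }
    # FILE_A: drop a line while copies remain to be cancelled
    skip[$0] > 0 { skip[$0]--; next }
    # Anything that survives is part of the difference
    { print }
    ' "$2" "$1"
}
```

With the sample files from the question, multiset_diff File1.txt File2.txt > Finalfile.txt should leave Data1, Data3, Data3 and Data6 in Finalfile.txt, in the same order as they appear in File1.txt.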