I have two text files, 1.txt, and 2.txt.
1.txt
https://soundcloud.com/track1,artisturi067
https://soundcloud.com/track4,artisturi428
https://soundcloud.com/track72,artisturi023
https://soundcloud.com/track22,artisturi181
2.txt
artisturi181
artisturi428
artisturi172
artisturi096
I'm looking for a way to compare the second column of each line in 1.txt against every line of 2.txt, and keep only the lines of 1.txt whose second column does not appear in 2.txt, resulting in something like this:
3.txt
https://soundcloud.com/track1,artisturi067
https://soundcloud.com/track72,artisturi023
Python, Bash for Windows, or even PowerShell would be helpful.
If you have the join command installed, you could do this:
$ join -t, -12 -21 -v1 -o1.1 -o1.2 <(sort -t, -k2 1.txt) <(sort 2.txt)
https://soundcloud.com/track72,artisturi023
https://soundcloud.com/track1,artisturi067
-t, : sets the field separator to a comma, since that's what 1.txt uses
-12 -21 : says to join on the 2nd field of the 1st file (-12) and the 1st field of the 2nd file (-21)
-v1 : tells join to output only those rows of the 1st file that have no match in the 2nd file
-o1.1 -o1.2 : says that we want to output the 1st and 2nd fields of the 1st file
<(sort -t, -k2 1.txt) : since join requires sorted inputs, we use process substitution to sort 1.txt on its 2nd key (-k2), using a comma as the delimiter (-t,)
<(sort 2.txt) : similarly, we sort the 2nd input file; since it contains a single column, we don't have to specify a separator or a key as we did for the previous file
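Since the question also mentions Python, here is a minimal sketch of the same anti-join without join or sort. The function names unmatched_lines and write_unmatched are made up for illustration, and the sketch assumes the key is whatever follows the last comma on each line of 1.txt and that both files are UTF-8 text. Unlike the join pipeline, it keeps the lines in their original order, which matches the 3.txt shown in the question.

```python
def unmatched_lines(pair_lines, key_lines):
    """Return the lines of pair_lines whose last comma-separated field
    does not appear as a line in key_lines (hypothetical helper)."""
    # Collect the keys from 2.txt into a set for fast membership tests.
    excluded = {key.strip() for key in key_lines if key.strip()}

    kept = []
    for line in pair_lines:
        line = line.rstrip("\r\n")
        if not line:
            continue  # skip blank lines
        # Assumption: the key is the text after the last comma.
        key = line.rpartition(",")[2].strip()
        if key not in excluded:
            kept.append(line)
    return kept


def write_unmatched(pairs_path="1.txt", keys_path="2.txt", out_path="3.txt"):
    """Read the two input files and write the unmatched lines to out_path."""
    with open(pairs_path, encoding="utf-8") as pairs, \
         open(keys_path, encoding="utf-8") as keys:
        kept = unmatched_lines(pairs, keys)
    with open(out_path, "w", encoding="utf-8") as out:
        out.writelines(line + "\n" for line in kept)
    return kept
```

Calling write_unmatched() in the directory that holds 1.txt and 2.txt should produce 3.txt. Pass explicit paths if the files live elsewhere.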