I'm trying to work with files like these on the command line:
File1:
,1,this is some content,
,2,another content,
,3,blablabla,
,4,xxxxxxxx,
,5,yyyyyyyy,
,6,zzzzzzzzzz,
... ...
File2:
1
3
4
5
Now I want to keep only the lines of file1 whose number (the second comma-separated field) is listed in file2, so the output should be:
,1,this is some content,
,3,blablabla,
,4,xxxxxxxx,
,5,yyyyyyyy,
I tried comm -3 file1 file2, but it doesn't work. Then I tried sed, which also didn't work. Is there another handy tool for this?
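The following will work on the example as given, but it matches the numbers anywhere in the line. It breaks if digits from File2 appear in the text after the second comma, or if one number is a substring of another (1 also matches 10 or 13):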
grep -F -f File2 File1
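A tighter variant, sketched here as an assumption rather than as part of the original answer, is to turn each number n into the anchored pattern ^,n, so that it can only match the second field. The scratch directory, the sample rows, and the patterns file are made up for illustration, and the sketch assumes the IDs contain no regex metacharacters.

```shell
# Hypothetical scratch files mirroring the question, plus an ID (13) that
# contains the digits of other IDs.
workdir=$(mktemp -d)
printf '%s\n' ',1,this is some content,' ',2,another content,' \
              ',3,blablabla,' ',13,has a 3 in the id,' > "$workdir/File1"
printf '%s\n' 1 3 > "$workdir/File2"

# Rewrite each ID n as the regex ^,n, so it matches only a whole second field.
sed 's/.*/^,&,/' "$workdir/File2" > "$workdir/patterns"

# Note: no -F here, because the patterns are now regular expressions.
anchored=$(grep -f "$workdir/patterns" "$workdir/File1")
printf '%s\n' "$anchored"
```

On this sample the anchored patterns keep only the rows for 1 and 3, while the plain grep -F version would also keep the row for 13.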
An alternative would be
join -t, -1 2 -2 1 -o 1.1,1.2,1.3 File1 File2
Here is how that works:
-t, : treat `,` as the field separator
-1 2 : join on the second field of file 1
-2 1 : join on the first field of file 2
-o 1.1,1.2,1.3 : output the first, second, and third fields of file 1
This still has a drawback: if the text that follows contains further commas, the output is cut off at the first of them, because field 3 is the last field printed.
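One more thing to watch for: join expects both inputs to be sorted on the join field, in the same collation order it uses itself. The example happens to be in order already, but multi-digit IDs such as 10 would not be. A minimal sketch of sorting first, assuming hypothetical scratch files and the C locale for both sort and join:

```shell
# Hypothetical scratch files; IDs are deliberately not in lexical order.
workdir=$(mktemp -d)
printf '%s\n' ',1,this is some content,' ',2,another content,' \
              ',10,ten,' ',3,blablabla,' > "$workdir/File1"
printf '%s\n' 3 10 > "$workdir/File2"

# Sort File1 on its second comma-separated field and File2 on its only field.
LC_ALL=C sort -t, -k2,2 "$workdir/File1" > "$workdir/File1.sorted"
LC_ALL=C sort "$workdir/File2" > "$workdir/File2.sorted"

# Same join as above, run on the sorted copies.
joined=$(LC_ALL=C join -t, -1 2 -2 1 -o 1.1,1.2,1.3 \
         "$workdir/File1.sorted" "$workdir/File2.sorted")
printf '%s\n' "$joined"
```

The result comes out in sorted (lexical) order rather than in File1's original order, and without the trailing comma, since the empty fourth field is not in the -o list.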
One way around the multiple-comma limitation is to use xargs:
join -t, -1 2 -2 1 -o 1.1,1.2 File1 File2 | xargs -Ixx grep '^xx,' File1
Explanation:
-Ixx : replace the string xx in the command that follows with each of the output lines from the preceding command, then execute that command once per line. This means we find the lines that match the leading ,number, which should make us insensitive to anything else on the line.
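If awk is available, the same filtering can be done in one pass without join or xargs. This is a swapped-in technique, not part of the answer above, and the scratch files below are hypothetical. It assumes File2 is non-empty and holds exactly one ID per line with no stray whitespace.

```shell
# Hypothetical scratch files; File2 is deliberately unsorted and one row of
# File1 has extra commas in its text.
workdir=$(mktemp -d)
printf '%s\n' ',1,this is some content,' ',2,another, with commas,' \
              ',3,more, commas, here,' ',10,ten,' > "$workdir/File1"
printf '%s\n' 3 10 1 > "$workdir/File2"

# While reading the first file (NR==FNR), remember each ID as an array key.
# For the second file, print the whole line if its second field is a known ID.
kept=$(awk -F, 'NR==FNR { keep[$1]; next } $2 in keep' \
       "$workdir/File2" "$workdir/File1")
printf '%s\n' "$kept"
```

Because the comparison is on the whole second field, 1 does not match 10, extra commas in the text do no harm, and the lines come out intact in File1's original order.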