I have a file with this specific format:
T 11722 A 330:0:0:0:0:0 315:0:0:0:0:0
T 11723 B 0:330:0:0:0:0 0:316:0:0:0:0
T 11725 C 0:327:0:0:0:0 0:314:0:0:0:0
T 11726 D 330:0:0:0:0:0 314:0:0:0:0:0
T 11727 E 0:6:0:323:0:0 0:6:0:309:0:0
T 11728 F 0:0:0:328:0:0 0:1:0:314:0:0
T 11729 G 0:325:0:0:0:0 0:315:0:0:0:0
I would like to remove any lines that don't have two non-zero values in columns 4 and 5.
For instance, if a line has the specific format:
T 11722 A 330:0:0:0:0:0 315:0:0:0:0:0
remove it.
If it has the following format (two non-zero values per column in columns 4 and 5):
T 11727 E 0:6:0:323:0:0 0:6:0:309:0:0
Keep it.
Thus, the expected result should be:
T 11727 E 0:6:0:323:0:0 0:6:0:309:0:0
T 11728 F 0:0:0:328:0:0 0:1:0:314:0:0
I have no idea how to set this up under Unix, but I am guessing there should be an easy way to do it. Any help would be greatly appreciated.
Many thanks
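Are you just trying to print lines where there are 2 or more non-zero values in $4 or $5? gsub() returns the number of replacements it made, so replacing every non-zero number with itself ("&") counts them. That'd be: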
$ awk 'gsub(/[1-9][0-9]*/,"&",$4)>1 || gsub(/[1-9][0-9]*/,"&",$5)>1' file
T 11727 E 0:6:0:323:0:0 0:6:0:309:0:0
T 11728 F 0:0:0:328:0:0 0:1:0:314:0:0
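If the counting trick with gsub() feels too terse, the same test can be spelled out with split(). This is only a sketch under the same assumption as above (keep a line when $4 or $5 has 2 or more non-zero values). The names filter_multi and nonzero are made up for illustration and are not part of awk. Unlike gsub() on a field, this version never assigns to $4 or $5, so awk does not rebuild the line and the original spacing is printed untouched.

```shell
# Hypothetical wrapper: reads the named file(s), or stdin if none are given.
filter_multi() {
  awk '
    # Count the non-zero entries in a colon-separated field.
    # parts, n, i and count are local variables (awk idiom: extra parameters).
    function nonzero(field,   parts, n, i, count) {
      n = split(field, parts, ":")
      count = 0
      for (i = 1; i <= n; i++)
        if (parts[i] + 0 != 0) count++
      return count
    }
    # A pattern with no action prints the line when the pattern is true.
    nonzero($4) > 1 || nonzero($5) > 1
  ' "$@"
}
```

You would call it as `filter_multi file`. To require 2 or more non-zero values in both columns instead of either one, change `||` to `&&`; note that with `&&` the F line from the question would be dropped, since its $4 has only one non-zero value.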