My input file has the following entries:
0100000000010001000 1 GWSL7YE02GHT73,
0010000000000000000 1 GWSL7YE02GU6GK,
0000000000000000000 1 GWSL7YE02G5W2B,
0010000000110000000 1 GWSL7YE02I364F,
0000000000000000000 1 GWSL7YE02F4IOC, Escherichia_coli_O127:H6
How can I capture only the lines that have a string at the end, such as line 5?
Another thing to note is that each line ends with two escape sequences, "\t" and "\n".
So in lines 1-4, do not assume that the "," is followed directly by "\n"; in reality it is followed by "\t" and then "\n".
I have the following awk code:
awk '{if ($0~/[A-Z0-9_]$/) print$NF}'
However, this assumes that the name ends with a letter, a digit, or an underscore. In reality the names can end with any special character; I only added the underscore "_" because my tests needed it. Is there a way other than this? Could I use something like awk '{if ($NF!~/an expression that matches ,\t\n/) ...}'?
Thanks
Just look for lines that have more than 3 fields:
awk 'NF>3' ./infile
$ cat -A lastfield
0100000000010001000 1 GWSL7YE02GHT73,^I$
0010000000000000000 1 GWSL7YE02GU6GK,^I$
0000000000000000000 1 GWSL7YE02G5W2B,^I$
0010000000110000000 1 GWSL7YE02I364F,^I$
0000000000000000000 1 GWSL7YE02F4IOC,^IEscherichia_coli_O127:H6^I$
$ awk 'NF>3' lastfield
0000000000000000000 1 GWSL7YE02F4IOC, Escherichia_coli_O127:H6
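Counting whitespace-separated fields could misfire if a name ever contains a space. Here is a rough sketch of two alternatives, assuming the layout that cat -A shows above (a tab after the comma and another tab before the newline). The temporary file and the variable names by_tab and by_comma are made up for this illustration, and the sample is cut down to two lines.

```shell
# Recreate two sample lines; each ends with a tab before the newline.
tmp=$(mktemp)
printf '0100000000010001000 1 GWSL7YE02GHT73,\t\n' > "$tmp"
printf '0000000000000000000 1 GWSL7YE02F4IOC,\tEscherichia_coli_O127:H6\t\n' >> "$tmp"

# Variant 1: split on tabs only. $2 is empty unless a name follows the comma.
by_tab=$(awk -F'\t' '$2 != "" { print $2 }' "$tmp")

# Variant 2: default splitting. Keep lines whose last field does not end in ",".
by_comma=$(awk '$NF !~ /,$/ { print $NF }' "$tmp")

echo "$by_tab"
echo "$by_comma"
rm -f "$tmp"
```

Both variants print only the name. The first one keeps working if the name has spaces in it, while the second assumes a name never ends with a comma.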