bash awk sed cut

How to delete a line if the letter after nth space matches a criteria?


I have a large text file with the data looking something like below:

ATOM 3515 CA GLY C 43 1.094 36.439 24.619 1.00 44.14 C

How would I delete a line depending on whether the character after the 4th space is a 'C'? I've seen a way to do something similar with cut, like

cut -f1,4 -d' ' 

That keeps only the selected fields and discards the rest of the line.

Is there a way to apply the test to the whole line instead? I can see a few other ways to do it, but I'd like to key specifically off the 4th space, so that I can be certain (or more so) that it won't match the wrong part of a line somewhere in the depths of the file.


Solution

  • How would I delete a line, depending on whether the character after the 4th space is a 'C'?

    You can use awk for this:

    awk '$5 !~ /^C/' file
    
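    • $5 is field 5, i.e. the field after the 4th space (awk splits on runs of whitespace by default, so this assumes the fields are separated by single spaces)
    • /^C/ matches when the 5th field starts with C; !~ negates the match, so only lines whose 5th field does not start with C are printed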

    You can get the same output using the substr function as well:

    awk 'substr($5, 1, 1) != "C"' file
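
If the input might contain consecutive spaces, the default field splitting no longer counts literal spaces. The sketch below shows two stricter variants next to the original command. It is an illustration, not part of the original answer: the second sample line (chain A) is invented for the demo, and the variable names are hypothetical. `-F'[ ]'` makes awk treat every single space as a separator, and the sed pattern deletes lines where four "non-spaces followed by one space" groups are directly followed by C.

```shell
# Sample data: the line from the question plus one invented line (chain A)
sample=$(mktemp)
printf '%s\n' \
  'ATOM 3515 CA GLY C 43 1.094 36.439 24.619 1.00 44.14 C' \
  'ATOM 3516 CA GLY A 44 1.094 36.439 24.619 1.00 44.14 C' > "$sample"

# Original approach: default splitting, runs of blanks count as one separator
kept_awk=$(awk '$5 !~ /^C/' "$sample")

# Strict variant: each single space is a separator,
# so $5 is exactly the text after the 4th space
kept_strict=$(awk -F'[ ]' '$5 !~ /^C/' "$sample")

# sed variant: delete lines with a C right after the 4th space
kept_sed=$(sed -E '/^([^ ]* ){4}C/d' "$sample")

printf '%s\n' "$kept_awk"
rm -f "$sample"
```

On single-space-separated input like the sample, all three commands should keep the same lines; they only differ once a line contains repeated spaces.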