I have a big data set and would like to check if every third line has the desired number of bases.
Example:
line 1
line 2
ATTGAC
line 4
line 5
TTCGGATC
line 7
line 8
GGTCAA
So line 6 contains 8 bases instead of 6. I would like my script to stop if this is the case.
Sounds like a job for awk:
awk 'NR % 3 == 0 && length($0) != 6 { print "line " NR " is the wrong length"; exit }' file
When the record number NR is a multiple of 3 and the length of the line isn't 6, print the message and exit.
Output from your example (assuming that all those blank lines aren't supposed to be there):
$ awk 'NR % 3 == 0 && length($0) != 6 { print "line " NR " is the wrong length"; exit }' file
line 6 is the wrong length
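Since you want your script to stop, note that a bare exit in awk returns status 0, so a calling script can't tell from the exit code that a bad line was found. A minimal sketch of one way to handle that, assuming a helper function named check_lengths (a hypothetical name, not part of awk) that takes the file and the expected length as arguments:

```shell
#!/bin/sh
# Hypothetical helper: check_lengths FILE LEN
# Prints the first offending line number and returns 1 if any
# third line is not exactly LEN characters long; returns 0 otherwise.
check_lengths() {
    awk -v len="$2" '
        NR % 3 == 0 && length($0) != len {
            print "line " NR " is the wrong length"
            exit 1   # non-zero status so the caller can react
        }
    ' "$1"
}
```

In the surrounding script you could then write something like check_lengths file 6 || exit 1 to stop as soon as a line of the wrong length is found.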