awk, trim, cut, fastq

How to trim every nth line by a different value?


I would like to trim a variable number of characters from the end of every 4th line. The number of characters to cut should be the difference between the character counts of line 4 and line 2, of line 8 and line 6, and so on.

For example: line 4 (29 characters) - line 2 (20 characters) = 9. So the last 9 characters of line 4 should be removed.

Input:

@V300059044L3C001R0010004402
AAGTAGATATCATGGAGCCG
+
FFFGFGGFGFGFFGFFGFFGGGGGFFFGG
@V300059044L3C001R0010009240
AAAGGGAGGGAGAATAAT
+
GFFGFEGFGFGEFDFGGEFFGGEDEGEGF

Output:

@V300059044L3C001R0010004402
AAGTAGATATCATGGAGCCG
+
FFFGFGGFGFGFFGFFGFFG
@V300059044L3C001R0010009240
AAAGGGAGGGAGAATAAT
+
GFFGFEGFGFGEFDFGGE

Solution

  • Running

    awk 'NR%4==0 {$0=substr($0,1,a)} NR%2==0 {a=length($0)}  {print $0}' input.txt
    

    on input.txt yields

    @V300059044L3C001R0010004402
    AAGTAGATATCATGGAGCCG
    +
    FFFGFGGFGFGFFGFFGFFG
    @V300059044L3C001R0010009240
    AAAGGGAGGGAGAATAAT
    +
    GFFGFEGFGFGEFDFGGE
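
Why this works: every even line stores its length in a, and every 4th line is cut to a before it is printed. Line 4 is also an even line, so a is set again there, but by then the line has already been trimmed to the same length, so nothing changes. A slightly more explicit variant, as a sketch only, records the length on line 2 of each 4-line record (NR % 4 == 2) and nowhere else. The function name trim_quality and the variable seqlen are illustrative and not part of the original answer; the sketch assumes every record is exactly 4 lines long.

```shell
# Hypothetical helper: trims line 4 of every 4-line record
# to the length of line 2 of the same record.
trim_quality() {
  awk '
    NR % 4 == 2 { seqlen = length($0) }          # line 2 of a record: remember its length
    NR % 4 == 0 { $0 = substr($0, 1, seqlen) }   # line 4 of a record: keep the first seqlen characters
    { print }                                    # print every line, trimmed or not
  ' "$@"
}
```

Calling trim_quality input.txt should give the same output as shown above. With no file argument the function reads from standard input. A 4th line that is already shorter than its 2nd line is left unchanged, because substr returns at most the characters that exist.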