I have a large FASTQ file and I want to add the sequence "TTAAGG" to the end of each sequence in it (the 2nd line, then every 4th line after that), while still keeping the FASTQ format. For example, this is the first record I start with:
@HWI-D00449:41:C2H8BACXX:5:1101:1219:2053 1:N:0:
GCAATATCCTTCAACTA
+
FFFHFHGFHAGGIIIII
and I want it to print out:
@HWI-D00449:41:C2H8BACXX:5:1101:1219:2053 1:N:0:
GCAATATCCTTCAACTATTAAGG
+
FFFHFHGFHAGGIIIII
I imagine sed or awk would be good for this, but I haven't been able to find a solution that allows me to keep the fastq format.
I tried:
awk 'NR%4==2 { print $0 "TTAAGG"}' < file_in.fastq > fileout_fastq
which added the TTAAGG to the second line and then every fourth line, but it also deleted the other three lines.
Does anyone have any suggestions for command lines I can use? If you know of an existing package that can do this, please let me know!
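Try this with GNU sed. The 2~4 address (a GNU extension) matches line 2 and every 4th line after it, and s/$/TTAAGG/ appends the text to the end of each matched line; all other lines are printed unchanged: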
sed '2~4s/$/TTAAGG/' file
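If GNU sed is not available, the same thing can be done with awk. The attempt in the question dropped the other three lines because its only rule prints lines where NR%4==2 and nothing tells awk to print the rest. A minimal sketch of one way around that: modify the line in place, then let a trailing 1 print every line. The printf lines only build a one-record sample from the question, and file_out.fastq is a placeholder name.

```shell
# Build a one-record sample input (the record from the question).
printf '%s\n' \
  '@HWI-D00449:41:C2H8BACXX:5:1101:1219:2053 1:N:0:' \
  'GCAATATCCTTCAACTA' \
  '+' \
  'FFFHFHGFHAGGIIIII' > file_in.fastq

# On line 2 of every 4-line record, append the tag to the line itself.
# The trailing "1" is an always-true pattern whose default action
# prints every line, whether it was modified or not.
awk 'NR % 4 == 2 { $0 = $0 "TTAAGG" } 1' file_in.fastq > file_out.fastq

cat file_out.fastq
```

Both commands write to standard output, so redirect to a new file as shown rather than to the input file itself.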