I have a file with the following format :
TRINITY_DN119001_c0_g1_i1 4 * 0 0 * * 0 0 GAGCCTCCCTCATGAATGTACCAGCATTTACCTCATAAAGAGCT * XO:Z:NM
TRINITY_DN119037_c0_g1_i1 4 * 0 0 * * 0 0 TAAGATTAGGTTGTATTCCAG * XO:Z:NM
TRINITY_DN119099_c0_g1_i1 4 * 0 0 * * 0 0 AGGCAGGCGCTAAACGATTTGCATTTCTCTAATGATTACGCCAG * XO:Z:NM
I am trying to extract the 1st and 10th column and store it in the following format(output file) :
>TRINITY_DN119099_c0_g1_i1
GAGCCTCCCTCATGAATGTACCAGCATTTACCTCATAAAGAGCT
>TRINITY_DN119037_c0_g1_i1
TAAGATTAGGTTGTATTCCAG
>TRINITY_DN119001_c0_g1_i1
AGGCAGGCGCTAAACGATTTGCATTTCTCTAATGATTACGCCAG
I am doing the following code for now :
cut -d " " -f1,10 in.txt > out.txt
sed 's/^/>/' out.txt
but,unable to get how to get above output.
You may use awk
:
awk '{printf ">%s\n%s\n", $1, $10}' file
>TRINITY_DN119001_c0_g1_i1
GAGCCTCCCTCATGAATGTACCAGCATTTACCTCATAAAGAGCT
>TRINITY_DN119037_c0_g1_i1
TAAGATTAGGTTGTATTCCAG
>TRINITY_DN119099_c0_g1_i1
AGGCAGGCGCTAAACGATTTGCATTTCTCTAATGATTACGCCAG
However note that it is 1st and 10th column in your shown output instead of 9th.