I have a BLAST output file. I ran a blast(x) command that outputs both "qseqid" and "sseqid":
QRv313_NP342_d0_h2_l9 YN13213
QRv313_NP9080_d0_h1_l1 YN5345
QRv313_NP123_d0_h1_l7 YN756
QRv313_NP123_d0_h1_l113 YN9768
QRv313_NP654_d0_h2_l6 YN432
QRv313_NP8_d0_h1_l1 YN3242
QRv313_NP756_d0_h1_l2 YN85686
I am writing a script (edited in nano on the command line) to obtain the following desired output:
NP342 YN13213
NP9080 YN5345
NP123 YN756
NP123 YN9768
NP654 YN432
NP8 YN3242
NP756 YN85686
I have written a script in nano that prints my query and subject IDs as tab-delimited columns, but I am having trouble moving forward from here. I am unsure how to modify the script to produce my desired output.
import sys
file_object = open(sys.argv[1])
for my_data in file_object:
    list = my_data.split("\t")
    print(list[0], list[1])
Is there a way to alter my command so I can receive the desired output?
Any suggestions would be kindly appreciated!
You can try:
import sys
with open(sys.argv[1]) as file_object:
    for my_data in file_object:
        a_list = my_data.split('\t')
        print(a_list[0].split('_')[1], a_list[1], sep='\t', end='')
list is a built-in type, so do not use it as a variable name. The code above splits each line on \t and then splits the first field on _. It then prints the desired fields delimited by \t (end='' is included to avoid printing a second newline, because the last field still carries the line's own trailing newline).
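If you would rather not rely on end='', a variant is to strip the newline first and put the splitting logic in a small function. This is only a sketch: the function name extract_ids and the sample lines are made up for illustration, and it assumes the file is tab-delimited with the query ID always shaped like PREFIX_NPxxx_...

```python
def extract_ids(line):
    """Return (short query ID, subject ID) from one tab-delimited line."""
    # Drop the trailing newline so the subject ID comes out clean.
    fields = line.rstrip('\n').split('\t')
    query_id, subject_id = fields[0], fields[1]
    # 'QRv313_NP342_d0_h2_l9' -> 'NP342' (second underscore-separated token)
    short_query = query_id.split('_')[1]
    return short_query, subject_id


# Illustrative lines in the same shape as the data in the question.
sample = [
    "QRv313_NP342_d0_h2_l9\tYN13213\n",
    "QRv313_NP8_d0_h1_l1\tYN3242\n",
]
for line in sample:
    print(*extract_ids(line), sep='\t')
```

To run it on a real file, replace the sample list with the open(sys.argv[1]) loop from the code above.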