I have a bash loop that runs a Python program over each identifier in a list (a text file) to download genomes. Is there a way to link each downloaded file to its ID from the list? The downloaded files have names that make them harder to use later on.
The loop in bash:
for i in $(more 'idpandas.txt'); do echo $i; ncbi-genome-download --format protein-fasta --species-taxid $i archaea,bacteria; done;
Is there any way to do this?
There should certainly be a way, but we need more information: what are the names of the files you are downloading?
for i in $(<idpandas.txt)
do
    echo "$i"
    ncbi-genome-download --format protein-fasta --species-taxid "$i" archaea,bacteria
    # DOWNLOAD_NAME is a placeholder: set it to the file the tool actually produced
    ln -s "$DOWNLOAD_NAME" "$i"
done
BTW, don't use "more" to build the list of loop elements: it is a pager, and it will cause problems in this scenario. Creating the link is as easy as the "ln" line above. I bet your real problem is knowing which filename gets generated, and that is something we don't know either.
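As a sketch of the point about "more": a "while read" loop reads the list line by line without involving a pager or word splitting. The sample list and the echo that stands in for the download command are placeholders of mine, not part of your setup.

```shell
# Hypothetical sample list, created only so the sketch runs on its own;
# in practice you would point "list" at idpandas.txt.
list=$(mktemp)
printf '%s\n' 2157 '' 1234 > "$list"

count=0
last=
# The "|| [ -n "$id" ]" part also handles a final line without a trailing newline.
while IFS= read -r id || [ -n "$id" ]; do
    [ -n "$id" ] || continue          # skip blank lines
    count=$((count + 1))
    last=$id
    echo "would download taxid $id"   # stand-in for the real download command
done < "$list"

echo "read $count ids"
```

Replace the echo inside the loop with the ncbi-genome-download call once the loop reads the IDs you expect.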
A quick-and-dirty approach, as suggested in one of my comments, is to store the downloaded files in a folder named after your ID. I don't know how your download tool works, but I think the following code should do the trick:
for i in $(<idpandas.txt)
do
    echo "$i"
    mkdir -p "$i"
    cd "$i" || continue    # skip this ID if the folder cannot be entered
    ncbi-genome-download --format protein-fasta --species-taxid "$i" archaea,bacteria
    cd ..
done
This should give you one folder per ID in the idpandas.txt file, each containing the files your tool downloaded for that ID.
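Once every ID has its own folder, linking each file to an ID-based name becomes mechanical, because the folder tells you which ID a file belongs to. The following is only a sketch: the download function is a stand-in I made up so the example runs offline, and the nested folder layout and the .faa.gz suffix are assumptions about what the tool writes, so check them against a real download first.

```shell
# Stand-in for the real download command (hypothetical). It only creates an
# empty file with an awkward name so the sketch runs without network access.
# Replace its body with the ncbi-genome-download call from the loop above.
download() {
    mkdir -p refseq/bacteria/GCF_000000000.1
    : > refseq/bacteria/GCF_000000000.1/GCF_000000000.1_protein.faa.gz
}

workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf '%s\n' 111 222 > idpandas.txt   # hypothetical sample IDs

while IFS= read -r id; do
    [ -n "$id" ] || continue
    mkdir -p "$id"
    ( cd "$id" && download )           # subshell, so no "cd .." is needed
    n=0
    # Link every file found under the ID folder to <id>_<n>.faa.gz.
    find "$PWD/$id" -type f -name '*.faa.gz' | sort |
    while IFS= read -r f; do
        n=$((n + 1))
        ln -sf "$f" "${id}_${n}.faa.gz"
    done
done < idpandas.txt

ls -l "$workdir"
```

The links use absolute paths, so they keep working from any directory, but they will break if the ID folders are moved afterwards.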