python jupyter-notebook blast ncbi

NCBI BLAST+ from the command line not recognizing the blastn command


I've been trying to generate HCR probes using the script provided here, which should be a pretty straightforward, user-friendly method, and I'd like to get it ready for the whole lab to use. I have some experience with Python, but I have never used NCBI BLAST+ before.

I'm running into an issue when trying to BLAST the generated probes: the script errors on the 'cline()' call. I've pip-installed and imported the cline package, but it makes no difference. Any ideas where the problem lies?

I haven't specified the location of the NCBI BLAST+ executable, which I've read on other pages might solve it, but I don't know how to integrate that into such complex code. BLAST+ is currently installed on my C:/ drive in Program Files.

Any suggestions are more than welcome!

I tried pip-installing cline (which the script doesn't require) and providing the path to the executable as follows:

blastn = r"C:\Program Files\NCBI\blast-BLAST_VERSION+\bin\blastn.exe"

However, it is not my own code and it's quite complex, so I don't know the correct way to integrate the path. I've read about similar issues that were fixed by specifying the blastn.exe path, but I haven't been able to do this.

FYI, I'm on Windows, using Jupyter Notebook via Anaconda.

ApplicationError                          Traceback (most recent call last)
Cell In[1], line 16
     14 strt = start()
     15 name,fullseq,amplifier,pause,choose,polyAT,polyCG,BlastProbes,db,dropout,show,report,maxprobe,numbr = strt[0],strt[1],strt[2],strt[3],strt[4],strt[5],strt[6],strt[7],strt[8],strt[9],strt[10],strt[11],strt[12],strt[13]
---> 16 maker(name,fullseq,amplifier,pause,choose,polyAT,polyCG,BlastProbes,db,dropout,show,report,maxprobe,numbr)

File c:\Users\wilke\OneDrive - Hubrecht Institute\Jupyter notebooks\insitu_probe_generator-v.0.3.2\maker37cb.py:411, in maker(name, fullseq, amplifier, pause, choose, polyAT, polyCG, BlastProbes, db, dropout, show, report, maxprobe, numbr)
    408 ## Probe BLAST setup and execution from FASTA file prepared in previous step
    410     cline = bn(query = str(name)+"PrelimProbes.fa", subject = db, outfmt = 6, task = 'blastn-short') #this uses biopython's blastn formatting function and creates a commandline compatible command 
--> 411     stdout, stderr = cline() #cline() calls the string as a command and passes it to the command line, outputting the blast results to one variable and errors to the other
    413     ## From results of blast creating a numpy array (and Pandas database)
    414     dt = [(np.unicode_,8),(np.unicode_,40),(np.int32),(np.int32),(np.int32),(np.int32),(np.int32),(np.int32),(np.int32),(np.int32),(np.float),(np.float)]

File ~\anaconda3\envs\hcr\lib\site-packages\Bio\Application\__init__.py:574, in AbstractCommandline.__call__(self, stdin, stdout, stderr, cwd, env)
    571     stderr_arg.close()
    573 if return_code:
--> 574     raise ApplicationError(return_code, str(self), stdout_str, stderr_str)
    575 return stdout_str, stderr_str

ApplicationError: Non-zero return code 1 from 'blastn -outfmt 6 -query tbx18PrelimProbes.fa -subject "C:\\Users\\wilke\\OneDrive - Hubrecht Institute\\Jupyter notebooks\\insitu_probe_generator-v.0.3.2\\fastas\\Tbx18-cDNA.fa" -task blastn-short', message "'blastn' is not recognized as an internal or external command,"


Current code:

from start import start
from maker37cb import maker
import pandas as pd
from Bio.Seq import Seq
from Bio.Blast.Applications import NcbiblastnCommandline
import io
import numpy as np
import pandas as pd
import cline

blastn = r"C:\Program Files\NCBI\blast-BLAST_VERSION+\bin\blastn.exe"

    
strt = start()
name,fullseq,amplifier,pause,choose,polyAT,polyCG,BlastProbes,db,dropout,show,report,maxprobe,numbr = strt[0],strt[1],strt[2],strt[3],strt[4],strt[5],strt[6],strt[7],strt[8],strt[9],strt[10],strt[11],strt[12],strt[13]
maker(name,fullseq,amplifier,pause,choose,polyAT,polyCG,BlastProbes,db,dropout,show,report,maxprobe,numbr)

Update:

I tried modifying the source code to include the path to the executable, as described in the above-mentioned Biostars post. Still no luck: now the path is not recognized and I still get an error.
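
For readers who hit the same "'blastn' is not recognized" message, one option that avoids editing the script is to put the BLAST+ bin folder on PATH from inside the notebook before calling maker(). The following is a minimal sketch, not the script author's method: the helper name prepend_to_path is made up for illustration, and the folder is simply the directory part of the path quoted above, with BLAST_VERSION left as a placeholder for whatever version is installed.

```python
import os
import shutil


def prepend_to_path(directory, env=None):
    """Put `directory` at the front of PATH so child processes can find
    executables stored in it. Does nothing if it is already listed."""
    env = os.environ if env is None else env
    parts = [p for p in env.get("PATH", "").split(os.pathsep) if p]
    if directory not in parts:
        parts.insert(0, directory)
    env["PATH"] = os.pathsep.join(parts)
    return env["PATH"]


# Assumed install folder; BLAST_VERSION is a placeholder, not a real folder name.
blast_bin = r"C:\Program Files\NCBI\blast-BLAST_VERSION+\bin"
prepend_to_path(blast_bin)

# shutil.which returns the full path if the command can be found, else None.
print(shutil.which("blastn"))
```

If the last line prints None, the folder is probably wrong and the script would still fail. Alternatively, if `bn` in the script is Biopython's NcbiblastnCommandline (as the comment in the traceback suggests), that wrapper takes the executable as its `cmd` argument, so the full path to blastn.exe could be passed there instead.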


Solution

  • It turns out that running the script from my OneDrive folder caused this issue. A fresh copy of the GitHub script on my C:/ drive ran fine from the start.

    Leaving it up in case anyone else runs into a similar issue. To amplify @Wayne's tip in the comments: don't run things from OneDrive. :)

    Spaces and unusual characters in paths are asking for trouble, and with OneDrive I've seen cases where files aren't actually where you think they are. It is a common cause of issues on the Jupyter Community Discourse Forum. However, it makes no sense that the notebook cannot see a file that you copied and placed next to it. Can you list the files alongside the notebook by running ls or dir in the notebook? Also run pwd to check that the current working directory is really where you suspect it is. If it's not, you can use %cd to change it. – Wayne
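
The checks suggested in the comment can also be done in plain Python, which is handy when the ls, dir, and pwd magics behave differently between setups. This is only a rough sketch: describe_environment is a hypothetical helper, and the space check is a simple heuristic for the kind of path that caused trouble here.

```python
import shutil
from pathlib import Path


def describe_environment(executable="blastn"):
    """Collect the facts worth checking when a command is 'not recognized'."""
    cwd = Path.cwd()
    return {
        # Where the notebook kernel is actually running.
        "cwd": str(cwd),
        # Files and folders sitting next to the notebook.
        "files": sorted(p.name for p in cwd.iterdir()),
        # Full path of the executable if it is on PATH, otherwise None.
        "executable": shutil.which(executable),
        # Spaces in the working directory (as in OneDrive folder names).
        "cwd_has_spaces": " " in str(cwd),
    }


info = describe_environment()
for key, value in info.items():
    print(f"{key}: {value}")
```

If "executable" comes back as None, the command is not on PATH for the kernel, regardless of where the script itself lives.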