I would like to use BiG-SCAPE (https://git.wageningenur.nl/medema-group/BiG-SCAPE/-/wikis/home) in Google Colab. How can I set it up and run an example?
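Colab does not come with conda preinstalled, so you must first install Anaconda (or Miniconda) in the runtime. The commands below assume conda is installed under /usr/local, so that its environments live in /usr/local/envs.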
The following commands will create a virtual environment and install BiG-SCAPE into it:
%%shell
eval "$(conda shell.bash hook)" # make "conda activate" available in this shell
# Create virtual environment for BiG-SCAPE, then install dependencies, BiG-SCAPE, and databases into it (this will take a while)
conda create --prefix /usr/local/envs/bigscape python=3.6 -y
# hmmer, mafft and fasttree are distributed through the bioconda channel
conda install --name bigscape -c conda-forge -c bioconda hmmer biopython mafft fasttree networkx numpy scipy scikit-learn=0.19.1 -y
conda activate bigscape
# Clone BiG-SCAPE from Git and install in virtual environment
cd /usr/local/envs/bigscape
git clone https://git.wur.nl/medema-group/BiG-SCAPE.git
# Download the Pfam database and index it with hmmpress
cd BiG-SCAPE
mkdir -p databases
cd databases
wget ftp://ftp.ebi.ac.uk/pub/databases/Pfam/releases/Pfam32.0/Pfam-A.hmm.gz && gunzip Pfam-A.hmm.gz
hmmpress Pfam-A.hmm
# Check that everything was installed correctly
cd ..
python bigscape.py --version
conda deactivate
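If the version check fails, or if you want more detail, a small pre-flight check can help narrow down what is missing. The following is only a sketch to run in a %%shell cell after activating the environment. The function names check_tools and check_pfam_index are made up for illustration, the tool list is an assumption based on the packages installed above, and the four index extensions are the files hmmpress writes next to Pfam-A.hmm.

```shell
# Sketch: report which external tools are visible on PATH.
# Returns non-zero if at least one tool is missing.
check_tools() {
  missing=0
  for tool in "$@"; do
    if command -v "$tool" >/dev/null 2>&1; then
      echo "found:   $tool"
    else
      echo "missing: $tool"
      missing=1
    fi
  done
  return "$missing"
}

# Sketch: confirm that hmmpress left its four index files in a directory.
# Returns non-zero if any of them is absent or empty.
check_pfam_index() {
  status=0
  for ext in h3f h3i h3m h3p; do
    if [ ! -s "$1/Pfam-A.hmm.$ext" ]; then
      echo "missing: $1/Pfam-A.hmm.$ext"
      status=1
    fi
  done
  return "$status"
}

# Tool names are assumptions; adjust them to what your install provides.
check_tools hmmscan hmmpress mafft fasttree || echo "some tools are missing from PATH"
check_pfam_index /usr/local/envs/bigscape/BiG-SCAPE/databases || echo "Pfam index incomplete; rerun hmmpress"
```

Both functions only print what they find and return a status, so they are safe to rerun at any point.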
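The following commands will download an example dataset and run BiG-SCAPE on it. The --outputdir path assumes Google Drive is already mounted at /gdrive; if it is not, change it to any writable directory: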
%%shell
eval "$(conda shell.bash hook)" # make "conda activate" available in this shell
# Download example dataset
cd /usr/local/envs/bigscape/BiG-SCAPE
mkdir -p demo
cd demo
wget https://raw.githubusercontent.com/nselem/bigscape-corason/master/scripts/data_bigscape_corason.sh
chmod a+x data_bigscape_corason.sh
bash data_bigscape_corason.sh -b
# Run BiG-SCAPE on example dataset
conda activate bigscape
python /usr/local/envs/bigscape/BiG-SCAPE/bigscape.py \
--inputdir /usr/local/envs/bigscape/BiG-SCAPE/demo/gbks \
--outputdir /gdrive/My\ Drive/Github/cluster_identification/demo/output/BiG-SCAPE \
--pfam_dir /usr/local/envs/bigscape/BiG-SCAPE/databases
conda deactivate
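If the run finishes quickly with little or no output, the input directory may be empty or in a different place than expected. As a rough sketch, assuming the example dataset uses the .gbk extension, you can count the GenBank files before starting a long run. The helper name count_gbks is hypothetical, and the path is the --inputdir value used above.

```shell
# Sketch: count GenBank files (by .gbk extension) under a directory,
# including subdirectories.
count_gbks() {
  find "$1" -type f -name '*.gbk' | wc -l | tr -d ' '
}

# Same path as --inputdir in the run above; change it if yours differs.
input_dir=/usr/local/envs/bigscape/BiG-SCAPE/demo/gbks
if [ -d "$input_dir" ]; then
  echo "GenBank files found: $(count_gbks "$input_dir")"
else
  echo "input directory not found: $input_dir"
fi
```

A count of zero suggests the download script put the files somewhere else or used a different extension.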