I want to use Snakemake (5.6.0) to work with files stored on the EGA. As a first step I tried the example from the official documentation:
import snakemake.remote.EGA as EGA

ega = EGA.RemoteProvider()

rule get_remote_file_ega:
    input:
        ega.remote("ega/dataset_id/foo.bam")
    output:
        "data/foo.bam"
    shell:
        "cp {input} {output}"
Before running the workflow I set the required environment variables (EGA_USERNAME and EGA_PASSWORD).
Running it then fails with the following error:
me:~/scripts$ snakemake -s test_ega.smk
Building DAG of jobs...
Traceback (most recent call last):
File "/home/puissant/miniconda3/lib/python3.7/site-packages/snakemake/__init__.py", line 551, in snakemake
export_cwl=export_cwl)
File "/home/puissant/miniconda3/lib/python3.7/site-packages/snakemake/workflow.py", line 433, in execute
dag.init()
File "/home/puissant/miniconda3/lib/python3.7/site-packages/snakemake/dag.py", line 122, in init
job = self.update([job], progress=progress)
File "/home/puissant/miniconda3/lib/python3.7/site-packages/snakemake/dag.py", line 603, in update
progress=progress)
File "/home/puissant/miniconda3/lib/python3.7/site-packages/snakemake/dag.py", line 655, in update_
missing_input = job.missing_input
File "/home/puissant/miniconda3/lib/python3.7/site-packages/snakemake/jobs.py", line 396, in missing_input
for f in self.input
File "/home/puissant/miniconda3/lib/python3.7/site-packages/snakemake/jobs.py", line 397, in <genexpr>
if not f.exists and not f in self.subworkflow_input)
File "/home/puissant/miniconda3/lib/python3.7/site-packages/snakemake/io.py", line 208, in exists
return self.exists_remote
File "/home/puissant/miniconda3/lib/python3.7/site-packages/snakemake/io.py", line 119, in wrapper
v = func(self, *args, **kwargs)
File "/home/puissant/miniconda3/lib/python3.7/site-packages/snakemake/io.py", line 258, in exists_remote
return self.remote_object.exists()
File "/home/puissant/miniconda3/lib/python3.7/site-packages/snakemake/remote/EGA.py", line 173, in exists
return self.parts.path in self.provider.get_files(self.parts.dataset)
File "/home/puissant/miniconda3/lib/python3.7/site-packages/snakemake/remote/EGA.py", line 126, in get_files
"data/metadata/datasets/{dataset}/files".format(dataset=dataset))
File "/home/puissant/miniconda3/lib/python3.7/site-packages/snakemake/remote/EGA.py", line 96, in api_request
headers["Authorization"] = "Bearer {}".format(self.token)
File "/home/puissant/miniconda3/lib/python3.7/site-packages/snakemake/remote/EGA.py", line 77, in token
self._login()
File "/home/puissant/miniconda3/lib/python3.7/site-packages/snakemake/remote/EGA.py", line 45, in _login
"client_id" : self._client_id(),
File "/home/puissant/miniconda3/lib/python3.7/site-packages/snakemake/remote/EGA.py", line 151, in _client_id
return self._credentials("EGA_CLIENT_ID")
NameError: name 'self' is not defined
The code involved is here (the traceback points to EGA.py line 151, which is line 3 below):
1 @classmethod
2 def _client_id(cls):
3     return self._credentials("EGA_CLIENT_ID")
Could the error come from using "self" instead of "cls" on line 3? After I changed it to "cls", the error moved on to the next method, which is written the same way. My understanding of Python objects is limited, so apologies if I am missing something obvious.
Have I forgotten any steps or misunderstood any of them?
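You are correct, you should be using cls (which presumably stands for 'class' here) rather than self. In a method decorated with @classmethod, the first parameter receives the class itself, and this one names it cls, so the name self is never bound inside the function and looking it up raises the NameError you see. self is generally the name used for instances of classes, i.e. objects. If self is used elsewhere in a function that is declared this way, you need to switch those uses to cls as well.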
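To make the difference concrete, here is a minimal sketch that reproduces the same NameError outside of Snakemake. The Provider class, its method names, and the "demo-client" value are hypothetical stand-ins for illustration only; this is not the actual code from snakemake/remote/EGA.py.

```python
import os


class Provider:
    """Hypothetical stand-in for a remote provider (not Snakemake's real class)."""

    @classmethod
    def _credentials(cls, name):
        # Read a credential from the environment, failing loudly if it is missing.
        if name not in os.environ:
            raise KeyError("missing environment variable " + name)
        return os.environ[name]

    @classmethod
    def broken_client_id(cls):
        # Bug: the first parameter is named `cls`, so `self` is never bound here.
        return self._credentials("EGA_CLIENT_ID")

    @classmethod
    def fixed_client_id(cls):
        # Fix: call the other classmethod through `cls`.
        return cls._credentials("EGA_CLIENT_ID")


# Illustrative value only, set so that the lookup succeeds in this demo.
os.environ["EGA_CLIENT_ID"] = "demo-client"

try:
    Provider.broken_client_id()
    broken_error = None
except NameError as err:
    broken_error = str(err)

fixed_value = Provider.fixed_client_id()

print("broken:", broken_error)
print("fixed:", fixed_value)
```

The broken variant fails only when it is called, which matches what you saw: fixing one method lets execution proceed until the next method with the same mistake is reached.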