Scrapyd: How to retrieve spiders or version of a scrapyd project?


It appears that either the scrapyd documentation is wrong or there is a bug. I want to retrieve the list of spiders from a deployed project. The docs tell me to do it this way:

curl http://localhost:6800/listspiders.json?project=myproject

So in my environment it looks like this:

merlin@192-143-0-9 spider2 % curl http://localhost:6800/listspiders.json?project=crawler                                                
zsh: no matches found: http://localhost:6800/listspiders.json?project=crawler

So the command does not seem to be recognised. Let's check the project availability:

merlin@192-143-0-9 spider2 % curl http://localhost:6800/listprojects.json                  
{"node_name": "192-143-0-9.ip.airmobile.co.za", "status": "ok", "projects": ["crawler"]}

Looks OK to me.

Checking the docs again, the other API calls pass their parameters not in the GET query string but as POST data with -d:

curl http://localhost:6800/schedule.json -d project=myproject -d spider=somespider

Applying this to listspiders:

merlin@192-143-0-9 spider2 % curl http://localhost:6800/listspiders.json -d project=crawler                                                 
{"node_name": "192-143-0-9.ip.airmobile.co.za", "status": "error", "message": "Expected one of [b'HEAD', b'object', b'GET']"}

The endpoint rejects the POST request and expects GET or HEAD. So it looks like we are running in circles.

How can one retrieve the list of spiders (listspiders.json) or versions (listversions.json) with scrapyd?


Solution

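  • The URL needs to be wrapped in quotes. The "no matches found" error comes from zsh, not from scrapyd: zsh treats the unquoted ? as a glob wildcard, finds no matching file name, and aborts before curl ever runs. Try: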

    curl "http://localhost:6800/listspiders.json?project=crawler"
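
If the same lookups are needed from a script, shell quoting can be sidestepped entirely. The following is a minimal sketch using only the Python standard library; it assumes scrapyd is listening on the default http://localhost:6800 from the question, and the names SCRAPYD_URL, build_url and scrapyd_get are hypothetical helpers invented for this example, not part of scrapyd.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumption: scrapyd runs at the default address used in the question.
SCRAPYD_URL = "http://localhost:6800"


def build_url(endpoint, **params):
    """Return the URL for a scrapyd endpoint, with params as a query string."""
    url = f"{SCRAPYD_URL}/{endpoint}"
    query = urlencode(params)  # percent-encodes values where needed
    return f"{url}?{query}" if query else url


def scrapyd_get(endpoint, **params):
    """Send a GET request to scrapyd and return the decoded JSON body."""
    with urlopen(build_url(endpoint, **params), timeout=10) as response:
        return json.load(response)


# Example calls (they need a running scrapyd instance):
# print(scrapyd_get("listspiders.json", project="crawler"))
# print(scrapyd_get("listversions.json", project="crawler"))
```

Calling urlopen without a data argument sends a GET request, which is the method the error message in the question asks for.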