The problem I'm having is that I can't upload my .egg file to scrapyd using

curl http://127.0.0.1:6800/addversion.json -F project=scraper_app -F version=r1 egg=@scraper_app-0.0.1-py3.8.egg

It returns an error message like this:

{"node_name": "Workspace", "status": "error", "message": "b'egg'"}
I'm using Django and Scrapy in the same project, and I have this folder structure:
my_app/
-- apps/                      # django apps folder
   -- crawler/
      -- __init__.py
      -- admin.py
      -- apps.py
      -- etc..
   -- pages/
      -- __init__.py
      -- admin.py
      -- apps.py
      -- etc..
-- my_app/                    # django project folder
   -- __init__.py
   -- asgi.py
   -- settings.py
   -- etc..
-- scraper_app/               # scrapy dir
   -- scraper_app/            # scrapy project folder
      -- spiders/
         -- abc_spider.py
      -- __init__.py
      -- middlewares.py
      -- pipelines.py
      -- settings.py
      -- etc..
   -- scrapy.cfg
-- manage.py
-- scrapyd.conf
-- setup.py                   # setuptools for creating the egg file
-- etc..
And here is what my setup.py looks like:
from setuptools import setup, find_packages

setup(
    name="scraper_app",
    version="1.0.0",
    author="Khrisna Gunanasurya",
    author_email="contact@khrisnagunanasurya.com",
    description="Create egg file from 'scraper_app'",
    packages=find_packages(where=['scraper_app'])
)
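For reference, the kind of setup.py that scrapyd-deploy auto-generates looks roughly like the sketch below. I believe find_packages()'s where argument expects a directory path string rather than a list, and the scrapy entry point is how Scrapyd locates the settings module; the name here just mirrors my project, so adjust as needed.

# A minimal setup.py along the lines of what scrapyd-deploy generates.
# Assumed to sit next to scrapy.cfg, i.e. in the folder that contains the
# inner scraper_app/ package.
from setuptools import setup, find_packages

setup(
    name="scraper_app",
    version="1.0",
    packages=find_packages(),
    # Scrapyd uses this entry point to find the Scrapy settings module.
    entry_points={"scrapy": ["settings = scraper_app.settings"]},
)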
My scrapyd.conf file:
[scrapyd]
eggs_dir = eggs
logs_dir = logs
logs_to_keep = 5
dbs_dir = dbs
max_proc = 0
max_proc_per_cpu = 4
http_port = 6800
debug = off
runner = scrapyd.runner
application = scrapyd.app.application
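As a sanity check, Scrapyd exposes a daemonstatus.json endpoint, so a quick sketch like the one below (assuming the requests package is installed) can confirm the daemon is actually up on the port configured above.

# Sketch: confirm Scrapyd is reachable before trying to upload an egg.
# Assumes the `requests` package is installed.
import requests

resp = requests.get("http://127.0.0.1:6800/daemonstatus.json")
print(resp.json())  # expected to contain "status": "ok" when the daemon is healthy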
And my scrapy.cfg content:
[settings]
default = scraper_app.settings
[deploy]
url = http://127.0.0.1:6800/
project = scraper_app
So what I want is to add an .egg file to my Scrapyd instance via addversion.json, and here is my step-by-step process to achieve that:

1. Run py setup.py bdist_egg
2. The .egg file gets generated in the dist/ folder and is called scraper_app-0.0.1-py3.8.egg
3. From inside the dist/ folder, run curl http://127.0.0.1:6800/addversion.json -F project=scraper_app -F version=r1 -F egg=@scraper_app-0.0.1-py3.8.egg
And then what I get is an error message. If I instead try to run the curl from the root directory, with something like

curl http://127.0.0.1:6800/addversion.json -F project=scraper_app -F version=r1 -F egg=@dist\scraper_app-0.0.1-py3.8.egg

(I'm using Windows), it returns this error:

curl: (6) Could not resolve host: dist\scraper_app-0.0.1-py3.8.egg
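As an aside, the same upload can also be done from Python with the requests library, which sidesteps curl's Windows path quoting entirely; here is a rough sketch, assuming requests is installed and the script is run from the folder that contains dist/.

# Sketch: upload the egg to Scrapyd's addversion.json without curl.
# Assumes `requests` is installed and dist/scraper_app-0.0.1-py3.8.egg exists.
import requests

with open("dist/scraper_app-0.0.1-py3.8.egg", "rb") as egg:
    resp = requests.post(
        "http://127.0.0.1:6800/addversion.json",
        data={"project": "scraper_app", "version": "r1"},
        files={"egg": egg},
    )
print(resp.json())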
I've already tried Googling it, but I can't find how to solve this or what wrong step I'm taking here. I also tried to create the .egg file from the scraper_app dir directly, i.e. just creating an egg file from the scraper_app project folder, but that's not working either.

Can someone tell me what's wrong with my project, or what I'm doing wrong here? Thank you.
After Googling some more, I tried scrapyd-client, but there are lots of problems with it on Windows; scrapyd-deploy isn't easy to use there. However, I found a video on YouTube that showed me the correct way to install scrapyd-client.

So here is the correct way to install it: make sure you are inside a virtualenv, and then install scrapyd-client with

pip install git+https://github.com/scrapy/scrapyd.git

That way it installs without any errors or difficulties, and then you can just run scrapyd-deploy in the Scrapy project folder.
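To double-check that the deploy actually landed, Scrapyd's listprojects.json and listspiders.json endpoints can be queried afterwards; a rough sketch, again assuming the requests package is installed.

# Sketch: verify that scrapyd-deploy uploaded the project successfully.
# Assumes `requests` is installed and Scrapyd runs on the default port 6800.
import requests

projects = requests.get("http://127.0.0.1:6800/listprojects.json").json()
print(projects)  # "scraper_app" should show up under the "projects" key

spiders = requests.get(
    "http://127.0.0.1:6800/listspiders.json",
    params={"project": "scraper_app"},
).json()
print(spiders)  # lists the spiders packaged in the uploaded egg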