python python-3.x egg scrapyd

Scrapyd raises NotADirectoryError when reading a file from an .egg file


I use Scrapyd to run my spider dynamically. I added a .txt file that contains a list of blocked words. My problem is the following: when I run the Scrapyd server as a daemon, it raises this error during scraping:

NotADirectoryError: [Errno 20] Not a directory: '/tmp/exa-1504173770-gm023ynt.egg/exa/classificator/large.txt'

But if I run the Scrapyd server from the project directory, everything works fine. Here is the setup.py code:

from setuptools import setup, find_packages

setup(
    name='project',
    version='1.0',
    packages=find_packages() + ['exa'],
    entry_points={'scrapy': ['settings = exa.settings']},
    package_dir={'exa': 'exa'},
    package_data={'exa': ['classificator/large.txt']}
)

And here is how I load the file:

file_dict = open(file_name_dictionary, "r")
self.correct_words = set()
for word in file_dict:
    self.correct_words.add(word[:-1])
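
The error can probably be reproduced without Scrapyd. The path in the traceback points inside the .egg, and a zipped egg is a single archive file rather than a directory, so open() cannot walk into it. Below is a minimal sketch under that assumption; demo.egg and its contents are hypothetical stand-ins, not the real project egg:

```python
import os
import tempfile
import zipfile

# Hypothetical stand-in for the temporary egg: a zip archive that
# contains exa/classificator/large.txt.
tmp_dir = tempfile.mkdtemp()
egg_path = os.path.join(tmp_dir, "demo.egg")
with zipfile.ZipFile(egg_path, "w") as egg:
    egg.writestr("exa/classificator/large.txt", "spam\neggs\n")

# A path that treats the archive as if it were a directory.
inner_path = os.path.join(egg_path, "exa", "classificator", "large.txt")

try:
    open(inner_path, "r")
    error = None
except OSError as exc:
    # NotADirectoryError on Linux; other platforms may raise a
    # different OSError subclass.
    error = exc

# Reading through the archive itself works.
with zipfile.ZipFile(egg_path) as egg:
    data = egg.read("exa/classificator/large.txt").decode("utf-8")
words = set(data.split())
```

Running the server from the project directory avoids the problem because the package is then imported from real directories on disk.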

UPDATE: I fixed this issue. The file has to be loaded with pkg_resources.resource_stream(resource_package, resource_path), which can read it from inside the .egg file.


Solution

  • Use pkg_resources.resource_stream(resource_package, resource_path) to load the file, because it can read resources packaged inside an .egg file, where a plain open() fails.
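
A minimal sketch of how the loading code might look with this fix. The helper names read_words and load_correct_words are hypothetical, the default arguments are taken from the setup.py and traceback above, and UTF-8 encoding is an assumption about large.txt. Note that resource_stream returns a binary stream, so the lines have to be decoded:

```python
import io


def read_words(stream):
    """Build a set of words from a binary stream, one word per line."""
    words = set()
    # Assumption: the word list is UTF-8 encoded.
    for raw_line in io.TextIOWrapper(stream, encoding="utf-8"):
        word = raw_line.strip()
        if word:  # skip blank lines
            words.add(word)
    return words


def load_correct_words(resource_package="exa",
                       resource_path="classificator/large.txt"):
    """Load the word list from the installed package, zipped or not."""
    # Imported lazily so read_words stays usable without setuptools,
    # which provides pkg_resources.
    import pkg_resources

    stream = pkg_resources.resource_stream(resource_package, resource_path)
    with stream:
        return read_words(stream)
```

In the spider this would replace the open() loop, for example self.correct_words = load_correct_words(). Unlike word[:-1], strip() does not drop the last character of a final line that has no trailing newline.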