Tags: python-2.7, scrapy, miniconda

ImportError: No module named [name].items


I am attempting to run my own Scrapy project. I thought I had resolved a related issue in a thread I posted here: [urlparse: ModuleNotFoundError, presumably in Python 2.7 and under conda]

I did a complete system image restore and simply installed Python 2.7 and Miniconda. However, Atom Editor is still flagging/underlining 'import urlparse'.

The code is based on a well-written book, and the author provides a great VM playground for running the scripts shown in the book. In the VM the code works fine.

However, in an attempt to practice on my own, I now receive the following error:

(p2env) C:\Users\User-1\Desktop\scrapy_projects\dictionary>scrapy crawl basic
Traceback (most recent call last):
  File "C:\Users\User-1\Miniconda2\envs\p2env\Scripts\scrapy-script.py", line 5, in <module>
    sys.exit(scrapy.cmdline.execute())
  File "C:\Users\User-1\Miniconda2\envs\p2env\lib\site-packages\scrapy\cmdline.py", line 148, in execute
    cmd.crawler_process = CrawlerProcess(settings)
  File "C:\Users\User-1\Miniconda2\envs\p2env\lib\site-packages\scrapy\crawler.py", line 243, in __init__
    super(CrawlerProcess, self).__init__(settings)
  File "C:\Users\User-1\Miniconda2\envs\p2env\lib\site-packages\scrapy\crawler.py", line 134, in __init__
    self.spider_loader = _get_spider_loader(settings)
  File "C:\Users\User-1\Miniconda2\envs\p2env\lib\site-packages\scrapy\crawler.py", line 330, in _get_spider_loader
    return loader_cls.from_settings(settings.frozencopy())
  File "C:\Users\User-1\Miniconda2\envs\p2env\lib\site-packages\scrapy\spiderloader.py", line 61, in from_settings
    return cls(settings)
  File "C:\Users\User-1\Miniconda2\envs\p2env\lib\site-packages\scrapy\spiderloader.py", line 25, in __init__
    self._load_all_spiders()
  File "C:\Users\User-1\Miniconda2\envs\p2env\lib\site-packages\scrapy\spiderloader.py", line 47, in _load_all_spiders
    for module in walk_modules(name):
  File "C:\Users\User-1\Miniconda2\envs\p2env\lib\site-packages\scrapy\utils\misc.py", line 71, in walk_modules
    submod = import_module(fullpath)
  File "C:\Users\User-1\Miniconda2\envs\p2env\lib\importlib\__init__.py", line 37, in import_module
    __import__(name)
  File "C:\Users\User-1\Desktop\scrapy_projects\dictionary\dictionary\spiders\basic.py", line 11, in <module>
    from terms.items import TermsItem
ImportError: No module named terms.items

My folder hierarchy is as follows:

└───dictionary
│   scrapy.cfg
│
└───dictionary
    │   items.py
    │   middlewares.py
    │   pipelines.py
    │   settings.py
    │   settings.pyc
    │   __init__.py
    │   __init__.pyc
    │
    └───spiders
            basic.py
            basic.pyc
            __init__.py
            __init__.pyc

My items.py code is as follows:

# -*- coding: utf-8 -*-

from scrapy.item import Item, Field


class TermsItem(Item):
    # Primary fields
    title = Field()
    definition = Field()
    # Housekeeping fields
    url = Field()
    project = Field()
    spider = Field()
    server = Field()
    date = Field()

My spider, basic.py, is as follows:

# -*- coding: utf-8 -*-

import datetime
import urlparse
import socket
import scrapy

from scrapy.loader.processors import MapCompose, Join
from scrapy.loader import ItemLoader

from terms.items import TermsItem


class BasicSpider(scrapy.Spider):
    name = "basic"
    allowed_domains = ["web"]

    # Start on a property page
    start_urls = (
        'http://dictionary.com/browse/there',
    )

    def parse(self, response):
        # Create the loader using the response
        l = ItemLoader(item=TermsItem(), response=response)

        # Load fields using XPath expressions
        l.add_xpath('title', '//h1[@class="head-entry"][1]/text()',
                    MapCompose(unicode.strip, unicode.title))
        l.add_xpath('definition', '//*[@class="def-list"][1]/text()',
                    MapCompose(unicode.strip, unicode.title))

        # Housekeeping fields
        l.add_value('url', response.url)
        l.add_value('project', self.settings.get('BOT_NAME'))
        l.add_value('spider', self.name)
        l.add_value('server', socket.gethostname())
        l.add_value('date', datetime.datetime.now())

        return l.load_item()

Atom Editor still flags 'import urlparse' and 'from scrapy.loader.processors import MapCompose, Join'.

In the Stack Overflow question Scrapy ImportError: No module named Item, coders are instructed to 'execute the Scrapy command from inside the top level directory of your project. – alecxe'. This has me wondering whether the conda environment I am using is causing the error. The 'No module named items' question makes a similar point: 'What is doing the import? And what is the working directory/the contents of sys.path. You can't find Project_L if the parent directory isn't the working directory and doesn't appear in sys.path. – ShadowRanger'. However, to the best of my knowledge I am structuring the project correctly, and the corresponding hierarchy is correct.
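To follow up on ShadowRanger's point, here is a small diagnostic (not from the book, just a sanity check) that prints the working directory and the entries of sys.path, which together determine what an import statement can resolve:

```python
import os
import sys

# The working directory and sys.path determine which top-level
# packages a statement like 'from terms.items import ...' can find.
print("cwd:", os.getcwd())
for entry in sys.path:
    print("path entry:", entry or "(empty string = current directory)")
```

Running this from the project's top-level directory (the one containing scrapy.cfg) shows whether that directory is actually on the search path.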

Any help would be greatly appreciated. Apologies for the lengthy post, I just wanted to be as comprehensive as possible and make sure that people appreciate the difference between this question and the similar ones I have linked to.

Regards,


Solution

  • The traceback indicates the problem is in a specific import. You can verify this from the command line. For example, on my machine I get:

    $ python -c 'from terms.items import TermsItem'
    Traceback (most recent call last):
      File "<string>", line 1, in <module>
    ImportError: No module named terms.items
    

    Looking at your folder hierarchy, I see no module called "terms", so that is probably what's missing. But since you indicate the code works in the author's VM, what I would do is run the following command in that VM:

    $ python -v -c 'from terms.items import TermsItem'
    

    The -v option makes Python print every module it imports and where each one is loaded from, e.g.:

    $ python -v -c 'from terms.items import TermsItem'
    # installing zipimport hook
    import zipimport # builtin
    # installed zipimport hook
    # /usr/local/var/pyenv/versions/2.7.12/lib/python2.7/site.pyc matches /usr/local/var/pyenv/versions/2.7.12/lib/python2.7/site.py
    import site # precompiled from /usr/local/var/pyenv/versions/2.7.12/lib/python2.7/site.pyc
    ...
    import encodings.ascii # precompiled from /usr/local/var/pyenv/versions/2.7.12/lib/python2.7/encodings/ascii.pyc
    Python 2.7.12 (default, Nov 29 2016, 14:57:54) 
    [GCC 4.2.1 Compatible Apple LLVM 7.0.2 (clang-700.1.81)] on darwin
    Type "help", "copyright", "credits" or "license" for more information.
    Traceback (most recent call last):
      File "<string>", line 1, in <module>
    ImportError: No module named terms.items
    # clear __builtin__._
    # clear sys.path
    ...
    # cleanup ints: 20 unfreed ints
    # cleanup floats
    

    If you run that where the code works, somewhere in that output there will be a successful import of terms.items. From that you may be able to work out the name of the missing module on your system and install it accordingly.
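    As an alternative to scanning the -v output by hand, you can ask the interpreter where a top-level module would be loaded from without importing it. The sketch below uses `importlib.util.find_spec` (Python 3; on Python 2.7 `pkgutil.find_loader` plays a similar role):

```python
import importlib.util

def locate(name):
    """Return where a top-level module would be loaded from, or None."""
    spec = importlib.util.find_spec(name)
    return spec.origin if spec else None

print("os    ->", locate("os"))     # stdlib module: always found
print("terms ->", locate("terms"))  # the package from the failing import
                                    # (None unless it is actually installed)
```

    Running this in both environments shows immediately whether "terms" is importable and, in the VM, which path it comes from.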

    EDIT: Looking closer at your post, I notice your "items.py" contains

    class TermsItem(Item):
        # Primary fields
        ...
    

    so I suspect the problem is the import path: the package containing items.py is named "dictionary" (the folder with the __init__.py), not "terms", so the import should be

    from dictionary.items import TermsItem
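    To convince yourself this is purely a sys.path issue and not a Scrapy issue, you can reproduce both the failure and the fix with a throwaway package laid out like the project in the question (the temp directory here is just for illustration):

```python
import os
import sys
import tempfile

# Build a minimal copy of the project layout:
#   <root>/dictionary/__init__.py
#   <root>/dictionary/items.py   (defines TermsItem)
root = tempfile.mkdtemp()
pkg = os.path.join(root, "dictionary")
os.makedirs(pkg)
open(os.path.join(pkg, "__init__.py"), "w").close()
with open(os.path.join(pkg, "items.py"), "w") as f:
    f.write("class TermsItem(object):\n    pass\n")

# 'from terms.items import ...' fails: no 'terms' package exists anywhere.
try:
    from terms.items import TermsItem
except ImportError as exc:
    print("expected failure:", exc)

# The import that matches the layout works once the directory holding
# the package is on sys.path (running 'scrapy crawl' from the top-level
# project directory has a similar effect, since the current directory
# ends up on the search path).
sys.path.insert(0, root)
from dictionary.items import TermsItem
print("success:", TermsItem.__name__)
```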