Tags: python, python-2.7, screen-scraping, scrapy

Error in Scrapy screen scraper - can't find what's wrong for the life of me


I can't figure out what is causing this error. It is raised on line 3 of craig.py, but I don't see anything wrong with that line.

Folder structure

  • Craig (folder)
    • Spiders (folder)
      • __init__.py
      • __init__.pyc
      • craig.py
      • craig.pyc
    • __init__.py
    • __init__.pyc
    • pipelines.py
    • settings.py
    • settings.pyc
    • scrapy.cfg

Project name: Craig. File name: Craig. Spider name: Craig.py

Craig.py

from scrapy.spider import BaseSpider
from scrapy.selector import HtmlXPathSelector
from craig.items import CraigslistSampleItem

class MySpider(BaseSpider):
    name = "craig"
    allowed_domains = ["craigslist.org"]
    start_urls = ["http://sfbay.craigslist.org/sfc/npo/"]

    def parse(self, response):
        hxs = HtmlXPathSelector(response)
        # Select every <p> element; each one is treated as a listing.
        titles = hxs.select("//p")
        items = []
        for title in titles:
            item = CraigslistSampleItem()
            # extract() returns a list of matching strings.
            item["title"] = title.select("a/text()").extract()
            item["link"] = title.select("a/@href").extract()
            items.append(item)
        return items

items.py

# -*- coding: utf-8 -*-

# Define here the models for your scraped items
#
# See documentation in:
# http://doc.scrapy.org/en/latest/topics/items.html

from scrapy.item import Item, Field


class CraigslistSampleItem(Item):
    title = Field()
    link = Field()

Here's the error:

Traceback (most recent call last):
  File "C:\Python27\Scripts\scrapy-script.py", line 9, in <module>
    load_entry_point('scrapy==0.24.4', 'console_scripts', 'scrapy')()
  File "C:\Python27\lib\site-packages\scrapy-0.24.4-py2.7.egg\scrapy\cmdline.py", line 143, in execute
    _run_print_help(parser, _run_command, cmd, args, opts)
  File "C:\Python27\lib\site-packages\scrapy-0.24.4-py2.7.egg\scrapy\cmdline.py", line 89, in _run_print_help
    func(*a, **kw)
  File "C:\Python27\lib\site-packages\scrapy-0.24.4-py2.7.egg\scrapy\cmdline.py", line 150, in _run_command
    cmd.run(args, opts)
  File "C:\Python27\lib\site-packages\scrapy-0.24.4-py2.7.egg\scrapy\commands\crawl.py", line 57, in run
    crawler = self.crawler_process.create_crawler()
  File "C:\Python27\lib\site-packages\scrapy-0.24.4-py2.7.egg\scrapy\crawler.py", line 87, in create_crawler
    self.crawlers[name] = Crawler(self.settings)
  File "C:\Python27\lib\site-packages\scrapy-0.24.4-py2.7.egg\scrapy\crawler.py", line 25, in __init__
    self.spiders = spman_cls.from_crawler(self)
  File "C:\Python27\lib\site-packages\scrapy-0.24.4-py2.7.egg\scrapy\spidermanager.py", line 35, in from_crawler
    sm = cls.from_settings(crawler.settings)
  File "C:\Python27\lib\site-packages\scrapy-0.24.4-py2.7.egg\scrapy\spidermanager.py", line 31, in from_settings
    return cls(settings.getlist('SPIDER_MODULES'))
  File "C:\Python27\lib\site-packages\scrapy-0.24.4-py2.7.egg\scrapy\spidermanager.py", line 22, in __init__
    for module in walk_modules(name):
  File "C:\Python27\lib\site-packages\scrapy-0.24.4-py2.7.egg\scrapy\utils\misc.py", line 68, in walk_modules
    submod = import_module(fullpath)
  File "C:\Python27\lib\importlib\__init__.py", line 37, in import_module
    __import__(name)
  File "C:\Users\Turbo\craig\craig\spiders\craig.py", line 3, in <module>
    from craig.items import CraigslistSampleItem
ImportError: No module named items

Solution

  • Please show where items.py sits in your project structure; the folder listing in the question does not include it.

    You should have something like this:

    • craig (folder)
      • craig.py
      • items (folder)
        • __init__.py
        • items.py
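
To help answer the question of where items.py sits, here is a minimal sketch of a layout check. It assumes the default Scrapy layout, in which items.py lives directly inside the project package (the inner craig folder, next to settings.py), since that is what `from craig.items import ...` expects. The function name `check_layout` and its messages are hypothetical, written for illustration, and are not part of Scrapy. It also flags one other possible cause worth ruling out: on Python 2, a spider file named craig.py inside craig/spiders/ can be picked up by implicit relative imports in place of the top-level craig package, which would give the same "No module named items" error.

```python
import os


def check_layout(project_root, package="craig"):
    """Return a list of layout problems that could break `from <package>.items import ...`.

    project_root is the folder that contains the project package
    (assumed here to be the folder holding scrapy.cfg).
    """
    pkg_dir = os.path.join(project_root, package)
    problems = []

    # The package folder needs __init__.py to be importable on Python 2.
    if not os.path.isfile(os.path.join(pkg_dir, "__init__.py")):
        problems.append("missing {0}/__init__.py".format(package))

    # Assumption: items.py is a plain module inside the package folder.
    if not os.path.isfile(os.path.join(pkg_dir, "items.py")):
        problems.append("missing {0}/items.py".format(package))

    # A spider file with the same name as the package can shadow it
    # under Python 2's implicit relative imports.
    if os.path.isfile(os.path.join(pkg_dir, "spiders", package + ".py")):
        problems.append(
            "{0}/spiders/{0}.py shadows the {0} package on Python 2".format(package)
        )

    return problems
```

You would call it with the project folder, for example `check_layout(r"C:\Users\Turbo\craig")`, and an empty list means none of these three problems was found. If the shadowing message appears, two common ways around it are renaming the spider file (say to craig_spider.py, a made-up name) or putting `from __future__ import absolute_import` at the top of the spider.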