python-2.7, screen-scraping, scrapy

Does Scrapy's item exporter support priority? If so, how?


By supporting priority, I mean that when you pop an item out of the item pipeline, it returns the item with the highest priority.


Solution

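  • Maybe you can customize it yourself with an item pipeline. The class here is only a skeleton: as written, it drops items whose id has already been seen and does not reorder anything, so the priority logic would still have to go into process_item.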

    pipelines.py

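    from scrapy.exceptions import DropItem


    class PriorityPipeline(object):
        def __init__(self):
            # ids of the items that have already passed through
            self.ids_seen = set()

        def process_item(self, item, spider):
            # Drop any item whose id was seen before; pass the rest on.
            if item['id'] in self.ids_seen:
                raise DropItem("Duplicate item found: %s" % item)
            self.ids_seen.add(item['id'])
            return item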
    

    settings.py

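    # Current Scrapy versions expect a dict here; the integer (0-1000)
    # sets the order in which pipelines run. 300 is an arbitrary choice.
    ITEM_PIPELINES = {
        'soufun.pipelines.PriorityPipeline': 300,
    }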
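
For the behaviour the question actually asks about (highest-priority item first), one possible approach is to buffer items in a heap and release them in order when the spider closes. This is only a sketch under assumptions: the `priority` field, the `PriorityBufferPipeline` name, and the `ordered` list are hypothetical and not part of Scrapy, and the demo feeds plain dicts through the pipeline methods so it runs without a crawl.

```python
import heapq
import itertools


class PriorityBufferPipeline(object):
    """Buffers items and releases them highest-priority first at close."""

    def open_spider(self, spider):
        self._heap = []
        # Unique counter breaks ties, so equal priorities keep arrival
        # order and the items themselves are never compared.
        self._counter = itertools.count()
        self.ordered = []

    def process_item(self, item, spider):
        # heapq is a min-heap, so the priority is negated to pop the
        # largest value first. 'priority' is a hypothetical item field;
        # items without it default to 0.
        priority = item.get('priority', 0)
        heapq.heappush(self._heap, (-priority, next(self._counter), item))
        return item

    def close_spider(self, spider):
        while self._heap:
            _, _, item = heapq.heappop(self._heap)
            # Placeholder: a real pipeline would write or export here.
            self.ordered.append(item)


# Stand-alone demonstration with dicts in place of Scrapy items.
pipeline = PriorityBufferPipeline()
pipeline.open_spider(spider=None)
for it in [{'id': 1, 'priority': 2},
           {'id': 2, 'priority': 9},
           {'id': 3, 'priority': 5}]:
    pipeline.process_item(it, spider=None)
pipeline.close_spider(spider=None)
ids = [it['id'] for it in pipeline.ordered]
```

The trade-off is that every item stays in memory until the crawl finishes, so this only suits crawls whose output fits in RAM. The class would be registered in ITEM_PIPELINES the same way as any other pipeline.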