I'm using Scrapy and want to save some of the .svg images from a webpage locally on my computer. The URLs for these images have the structure '__.com/svg/4/8/3/1425.svg' (each is a full, working URL, https included).
I've defined the item in my items.py file:
class ImageItem(scrapy.Item):
    image_urls = scrapy.Field()
    images = scrapy.Field()
I've added the following to my settings:
ITEM_PIPELINES = {
    'scrapy.pipelines.images.ImagesPipeline': 1,
}
IMAGES_STORE = '../Data/Silks'
MEDIA_ALLOW_REDIRECTS = True
In the main parse function I'm calling:
imageItem = ImageItem()
imageItem['image_urls'] = [url]
yield imageItem
But it doesn't save the images. I've followed the documentation and tried numerous things, but I keep getting the following error:
StopIteration: <200 https://www.________.com/svg/4/8/3/1425.svg>
During handling of the above exception, another exception occurred:
......
......
PIL.UnidentifiedImageError: cannot identify image file <_io.BytesIO object at 0x1139233b0>
Am I missing something? Can anyone help? I am fully stumped.
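Gallaecio was right! Scrapy was having an issue with the .svg file type: ImagesPipeline opens every download with Pillow (PIL), which cannot read SVG, hence the UnidentifiedImageError. I changed the ImagesPipeline to the FilesPipeline and it works!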
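For reference, here is a minimal sketch of what the switch might look like. It assumes the same store path as in the question and uses a plain dict as the item (Scrapy accepts dicts as items); `make_file_item` is a hypothetical helper added for illustration, not something from my project. FilesPipeline reads URLs from `file_urls` and uses the `FILES_STORE` setting instead of `image_urls` and `IMAGES_STORE`.

```python
# --- settings.py (sketch): use FilesPipeline instead of ImagesPipeline ---
ITEM_PIPELINES = {
    "scrapy.pipelines.files.FilesPipeline": 1,
}
# FilesPipeline reads FILES_STORE rather than IMAGES_STORE.
FILES_STORE = "../Data/Silks"
MEDIA_ALLOW_REDIRECTS = True


# --- spider side (sketch) ---
def make_file_item(url):
    """Hypothetical helper: build an item in the shape FilesPipeline expects.

    The pipeline downloads every URL listed under 'file_urls' and, once
    finished, stores the download results under the 'files' key.
    """
    return {"file_urls": [url]}
```

In the parse function this would replace the ImageItem lines with something like `yield make_file_item(url)`. If you prefer a scrapy.Item subclass, declare `file_urls` and `files` fields on it instead of `image_urls` and `images`.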
For anyone stuck, the documentation is here.