I need to access the custom settings passed from the CLI with `-s SETTING_NAME="SETTING_VAL"` from the `__init__()` method of the spider class. `get_project_settings()` only gives me access to the static project settings.
The docs explain how you can access those custom settings from a pipeline by overriding the `from_crawler` classmethod:

```python
@classmethod
def from_crawler(cls, crawler):
    settings = crawler.settings
```
But is there any way to access them from the spider's `__init__()` method?
Just use `self.settings.get()`, e.g.

```python
print(self.settings.get('SETTING_NAME'))
```

will print `SETTING_VAL`.
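One caveat: `self.settings` is only populated once the crawler has been attached to the spider, so this works in callbacks such as `parse()`, but not inside `__init__` itself. A minimal runnable sketch (the spider name, URL, and setting name here are made up):

```python
import scrapy


class SettingsDemoSpider(scrapy.Spider):
    name = "settings_demo"
    start_urls = ["https://example.com"]

    def parse(self, response):
        # By the time callbacks run, self.settings is available,
        # so overrides passed on the CLI with -s are visible here.
        print(self.settings.get('SETTING_NAME'))
```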
If you want to access a setting in your spider's `__init__` you have a couple of options. If your command-line option is really just a spider argument, pass it with `-a` instead of `-s`; Scrapy forwards `-a` arguments straight to the spider's constructor as keyword arguments (see the sketch below). If for some reason you need to access an actual setting in your spider's `__init__`, then you have to override the `from_crawler` classmethod, as described in the docs.
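For the `-a` route, a minimal sketch (the `category` argument name is made up):

```python
import scrapy


class ArgSpider(scrapy.Spider):
    name = "arg_spider"

    def __init__(self, category=None, *args, **kwargs):
        super().__init__(*args, **kwargs)
        # A value passed with -a category=... lands here directly.
        self.category = category
```

Run it with e.g. `scrapy runspider arg_spider.py -a category=books`.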
Here is an example of the `from_crawler` approach:
```python
import scrapy


class ArgsSpider(scrapy.Spider):
    name = "my_spider"

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        print('kwargs =', kwargs)

    @classmethod
    def from_crawler(cls, crawler, *args, **kwargs):
        # Read the setting off the crawler and hand it to __init__
        # as a regular keyword argument.
        spider = cls(
            *args,
            my_setting=crawler.settings.get("MY_SETTING"),
            **kwargs
        )
        spider._set_crawler(crawler)
        return spider
```
Run it with e.g. `scrapy runspider args_spider.py -s MY_SETTING=hello,world!` and you will see your setting in the `kwargs` dict. You can of course get other settings this way too.
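Since `-s` values arrive as strings, it is also worth knowing that Scrapy's `Settings` object provides typed getters such as `getint`, `getbool` and `getlist`. A self-contained sketch (the setting names here are made up):

```python
from scrapy.settings import Settings

# -s values arrive as strings; the typed getters coerce them.
settings = Settings({"MY_TIMEOUT": "45", "MY_DEBUG": "True", "MY_TAGS": "a,b,c"})
print(settings.getint("MY_TIMEOUT", 30))    # 45
print(settings.getbool("MY_DEBUG", False))  # True
print(settings.getlist("MY_TAGS"))          # ['a', 'b', 'c']
```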