HTML:
<span class="number"> - Sep 15, 1991<br><strong>Some Number: </strong>123, 123, 145</span>
Scrapy:
samples = response.css('ul li.somthing')
for sample in samples:
    loader = ItemLoader(item=CatelogItem(), selector=sample)
    loader.add_css('some', 'span.number::text')
    yield loader.load_item()
Item.py
some = Field(
    input_processor=MapCompose(str.strip),
    output_processor=Join()
)
Result
- Sep 15, 1991
Expected
- Sep 15, 1991 Some Number: 123, 123, 145
Why does this happen? How do I get the full value loaded into the ItemLoader?
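You need to select all of the text inside the element, including the text of its nested elements, rather than only the element's own text. The ::text pseudo-element matches just the text nodes that are direct children of span.number, so the text inside <strong> is left out. Use *::text to pick up the text of the nested elements as well: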
loader.add_css('some', 'span.number *::text')
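With that selector, the loader receives several text fragments instead of one, and the field's processors then clean and combine them. Here is a rough plain-Python sketch of that last step. The fragments list is an assumption about what the selector returns for the HTML in the question, and strip_each and join_with_space are hypothetical stand-ins for MapCompose(str.strip) and Join(), not Scrapy's own code:

```python
# Assumed text nodes returned by 'span.number *::text' for the HTML above.
fragments = [" - Sep 15, 1991", "Some Number: ", "123, 123, 145"]


def strip_each(values):
    """Hypothetical stand-in for MapCompose(str.strip): strip every string."""
    return [value.strip() for value in values]


def join_with_space(values):
    """Hypothetical stand-in for Join(): join the values with a single space."""
    return " ".join(values)


result = join_with_space(strip_each(fragments))
print(result)  # - Sep 15, 1991 Some Number: 123, 123, 145
```

Because each fragment is stripped before joining, the trailing space after "Some Number:" does not turn into a double space in the final value.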