python xpath web-scraping scrapy scrapy-shell

Scrapy Shell XPath

I am trying to get the links and categories from this news feed website: http://www.npr.org/rss/#feeds

This is my xpath in scrapy shell:

a = sel.xpath('//ul[@class="rsslinks"]/li/a/@href').extract()

b = sel.xpath('//ul[@class="rsslinks"]/li/a/text()').extract()

But the length of b is one less than the length of a. I don't know what I am missing here, and the mismatch pairs categories with the wrong links in my data.

In the screenshot, the category name is "Most Emailed Stories", but the link is for "News Headlines".

Any help would be appreciated.

Solution

This is because of the first link in the results:

<a class="iconlink xml" href="/rss/rss.php?id=1001" target="blank"><strong>News Headlines</strong></a>

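As you can see, this a element has no direct child text nodes, only a single strong element. Your XPath, a/text(), selects only direct text children, so it returns nothing for this link and b ends up one item shorter than a.

Add another slash to get all descendant text nodes of the a tag: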

//ul[@class="rsslinks"]/li/a//text()
                         HERE^
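
The difference between direct and descendant text can be reproduced without Scrapy. The following is only a minimal sketch using the standard library's xml.etree.ElementTree, not Scrapy selectors: the markup is a simplified, hypothetical stand-in for the real page (the second link's href is made up), a.text plays the role of a/text(), and itertext() plays the role of a//text().

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified markup modelled on the snippet above.
# The second href is invented for illustration only.
HTML = """
<div>
  <ul class="rsslinks">
    <li><a class="iconlink xml" href="/rss/rss.php?id=1001"><strong>News Headlines</strong></a></li>
    <li><a class="iconlink xml" href="/rss/rss.php?id=100">Most Emailed Stories</a></li>
  </ul>
</div>
"""

root = ET.fromstring(HTML.strip())
links = root.findall(".//ul[@class='rsslinks']/li/a")

hrefs = [a.get("href") for a in links]

# Direct text only (roughly what a/text() selects): the first link
# contributes nothing because its text lives inside <strong>.
direct_text = [a.text for a in links if a.text]

# All descendant text, joined per link (roughly a//text(), but grouped
# by element so every href keeps exactly one label).
all_text = ["".join(a.itertext()).strip() for a in links]

pairs = list(zip(hrefs, all_text))

print(len(hrefs), len(direct_text))
print(pairs)
```

Note that a//text() returns one string per text node, so a link containing several text nodes would make the two lists differ in length again. In the Scrapy shell, a more robust pattern is probably to select the a elements first and then read @href and string(.) from each one, so every link and its label stay paired.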
