I am trying to extract the socket type of a CPU from its PCPartPicker product page. I have identified that the socket type is under the <h4> Socket heading.
So far I have been able to select .specs.block and find all the <h4> elements nested inside it. However, I can't get the text under each heading.
Here is my code:
from requests_html import HTMLSession

session = HTMLSession()
# The product ID is part of the URL string
r = session.get('https://au.pcpartpicker.com/product/jLF48d')
# Select the specs block, then every <h4> heading inside it
about = r.html.find('.specs.block')[0]
about = about.find('h4')
print(about)
This prints:
[ <Element 'h4' >, <Element 'h4' >, <Element 'h4' >, <Element 'h4' >,
<Element 'h4' >, <Element 'h4' >, <Element 'h4' >, <Element 'h4' >,
<Element 'h4' >, <Element 'h4' >, <Element 'h4' >]
However, when I change the print statement to:
print(about.text)
I get the following error:
AttributeError: 'list' object has no attribute 'text'
Update:
print(about[0].text)
This code prints:
Manufacturer AMD
This is the first heading and its text; however, I need the 4th.
Any idea what code I can use to reach the desired result?
If you require any more information please let me know.
Replacing
print(about[0].text)
with
print(about[3].text)
in the code from my question above solved the problem for me. Lists are zero-indexed, so index 3 selects the 4th <h4>.
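A hard-coded index will break if the page ever reorders its headings, so it may be safer to pick the element by its heading text instead. Below is a minimal sketch of that idea, not a tested scraper. The helper name find_spec is hypothetical, and it assumes each element's .text starts with the heading followed by its value, as in the "Manufacturer AMD" output above. The sample data uses stand-in objects with made-up values so the sketch runs without a network request.

```python
from types import SimpleNamespace


def find_spec(elements, label):
    """Return the text that follows `label` in the first matching element.

    `elements` is any iterable of objects with a `.text` attribute, such as
    the list returned by `about.find('h4')`. Returns None if nothing matches.
    """
    for element in elements:
        text = element.text.strip()
        if text.startswith(label):
            # Drop the heading itself and keep only the value after it
            return text[len(label):].strip()
    return None


# Stand-ins for the scraped <h4> elements so the sketch runs offline.
# Only "Manufacturer AMD" comes from the output above; the rest are made up.
sample = [
    SimpleNamespace(text="Manufacturer AMD"),
    SimpleNamespace(text="Placeholder Heading placeholder value"),
    SimpleNamespace(text="Socket EXAMPLE-SOCKET"),
]

print(find_spec(sample, "Socket"))
```

With the real page, the call would presumably be find_spec(about, 'Socket'), where about is the list returned by about.find('h4'). Note that a prefix match would also accept any other heading that begins with the same word, so a stricter comparison may be needed if such headings exist.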