Search code examples
pythonxpathlxmlscreen-scraping

Want to extract links and titles from a certain website with lxml and python but cant


I am Yasa James, 14, and I am new to web scraping. I am trying to extract titles and links from this website. As an so called "Utako" and a want-to-be programmer, I want to create a program that extract links and titles at the same time. I am currently using lxml because I cant download selenium, limited internet, very slow internet because Im from a province in philippines and I think its faster than other modules that I've used.

here's my code:

from lxml import html
import requests

url = 'https://animixplay.to/dr.%20stone'
page = requests.get(url)
doc = html.fromstring(page.content)

anime = doc.xpath('//*[@id="result1"]/ul/li[1]/p[1]/a/text()')

print(anime)

One thing I've noticed is that is I want to grab the value of an element from any of the divs, is it gives out an empty list as an output.

I hope you can help me with this my Seniors. Thank You!

Update: i used requests-html to fix my problem and now its working, Thank you!


Solution

  • The reason this does not work is that the site you're trying to fetch uses JavaScript to generate the results, which means Selenium is your only option if you want to scrape the HTML. Any static fetching and processing libraries like lxml and beautifulsoup simply do not have the ability to parse the result of JavaScript calls.