I'm learning XPath and trying to get the value of a node that has a specific attribute, using a Google Play Store page as an example, with Python and lxml.html. From the HTML below, I want to get the developer email from the "a" node whose "href" attribute starts with "mailto:". My Python snippet returns the app name but an empty developer email. Thank you.
<html>
<div class="id-app-title" tabindex="0">Candy Crush Saga</div>
<div class="meta-info meta-info-wide">
<div class="title"> Developer </div>
<a class="dev-link" href="https://www.google.com/url?q=http://candycrush.com" rel="nofollow" target="_blank"> Visit website </a>
<a class="dev-link" href="mailto:[email protected]"
rel="nofollow" target="_blank">[email protected] </a> ##Interesting part here
</div>
</html>
def get_app_from_link(self,link):
    start_page=requests.get(link)
    #print start_page.text
    tree = html.fromstring(start_page.text)
    name = tree.xpath('//div[@class="id-app-title"]/text()')[0]
    #developer=tree.xpath('//div[@class="dev-link"]//*/div/@href')
    developer=tree.xpath('//div[contains(@href,"mailto") and @class="dev-link"]/text()')
    print name,developer
    return
Your XPath selects div elements, but the href attribute is on the a element. Use a instead of div:
'//a[contains(@href,"mailto") and @class="dev-link"]/text()'
Also, your function doesn't return anything. Use return, like this:
def get_app_from_link(self,link):
    # your code
    return name, developer
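
Putting both fixes together, here is a minimal sketch that can be run without hitting the network. It is written for Python 3 and assumes lxml is installed. The function name extract_app_info, the SAMPLE string, and the address dev@example.com are placeholders introduced for illustration, not taken from the real page. It also uses starts-with() rather than contains(), since the question asks for an href that starts with "mailto:"; either should work on this markup.

```python
from lxml import html

# Markup adapted from the question. The email address is a placeholder,
# not the real developer address.
SAMPLE = """
<html><body>
<div class="id-app-title" tabindex="0">Candy Crush Saga</div>
<div class="meta-info meta-info-wide">
  <div class="title"> Developer </div>
  <a class="dev-link" href="https://www.google.com/url?q=http://candycrush.com" rel="nofollow" target="_blank"> Visit website </a>
  <a class="dev-link" href="mailto:dev@example.com" rel="nofollow" target="_blank">dev@example.com </a>
</div>
</body></html>
"""


def extract_app_info(page_source):
    """Return (name, developer_email); either value is None if not found."""
    tree = html.fromstring(page_source)
    names = tree.xpath('//div[@class="id-app-title"]/text()')
    # Select <a> (not <div>) elements whose href starts with "mailto:".
    emails = tree.xpath(
        '//a[starts-with(@href, "mailto:") and @class="dev-link"]/text()'
    )
    # xpath() returns a list, so guard against empty results before indexing.
    name = names[0].strip() if names else None
    developer = emails[0].strip() if emails else None
    return name, developer


name, developer = extract_app_info(SAMPLE)
print(name, developer)
```

To use it on a live page, pass requests.get(link).text to the function instead of SAMPLE. If you want the address from the attribute rather than the link text, selecting @href instead of text() and stripping the "mailto:" prefix is another option.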