python xpath finance stock

XPath: Assessing an Error in this Line of Code?


I recently began learning XPath for a Python project, but I can't seem to get the following line to select the correct piece of data.

//table[@id="yfncsumtab"]//tr/td/a[@rel="first"]

The data is found on this page: http://finance.yahoo.com/q/hp?s=QQQX+Historical+Prices

(Use Inspect Element on the "Next" link to see the markup I'm trying to write an XPath for. In other words, press Command/Control+F on that page, search for "Next", and use Inspect Element on the first result.)

I've tried many variations of that code, but none seem to select the proper text. I appreciate any and all help - thanks in advance!


Solution

  • '//a[text()="Next"]'
    

    or:

    '//table[@id = "yfncsumtab"]//a[text()="Next"]'
    

    or, to get just the first one:

    '//table[@id = "yfncsumtab"]//table[1]/tr/td/a[text()="Next"]'
    

    or:

    '//table[@id="yfncsumtab"]/tr[2]/td[1]/table[1]/tr/td/a[1]'
    

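    The more specific the XPath, the less of the document has to be searched, so the faster the element is found. The trade-off is brittleness: if the developers make even a small change to the HTML structure surrounding the target element, a very specific XPath will stop matching. Note also that the "Next" link carries rel="next" (see the output below), so a predicate of @rel="first" will not select it.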

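    from lxml import html

    # html.parse accepts a filename, URL, or file-like object
    doc = html.parse("http://finance.yahoo.com/q/hp?s=QQQX+Historical+Prices")

    # Match every <a> element whose text is exactly "Next"
    my_xpath = '//a[text()="Next"]'

    for element in doc.xpath(my_xpath):
        print("<{}>".format(element.tag))
        print("  text = {}".format(element.text))

        # Print each attribute of the link (here: rel and href)
        for attr, val in element.items():
            print("  {} = {}".format(attr, val))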
    
    
    --output:--
    <a>
      text = Next
      rel = next
      href = /q/hp?s=QQQX&d=11&e=28&f=2014&g=d&a=1&b=1&c=2007&z=66&y=66
    <a>
      text = Next
      rel = next
      href = /q/hp?s=QQQX&d=11&e=28&f=2014&g=d&a=1&b=1&c=2007&z=66&y=66
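
To see the speed/brittleness trade-off without depending on the live page, here is a minimal sketch that runs the XPaths against a small inline document. The markup built by `make_page` is a hypothetical, heavily simplified stand-in for the real page (the real structure may well differ), and the names `make_page`, `link_texts`, and `STRICT_XPATH` are mine, not from the question.

```python
from lxml import html


def make_page(extra_wrapper=False):
    """Build a hypothetical, simplified page; NOT the real Yahoo markup."""
    inner = (
        '<table><tr><td>'
        '<a rel="first" href="#first">First</a> '
        '<a rel="next" href="#next">Next</a>'
        '</td></tr></table>'
    )
    if extra_wrapper:
        # Simulate a small structural change made by the site's developers
        inner = '<div>' + inner + '</div>'
    return html.fromstring(
        '<html><body><table id="yfncsumtab"><tr><td>'
        + inner
        + '</td></tr></table></body></html>'
    )


# The XPath from the question: it filters on rel="first"
QUESTION_XPATH = '//table[@id="yfncsumtab"]//tr/td/a[@rel="first"]'
# Loose: any link whose text is "Next"
LOOSE_XPATH = '//a[text()="Next"]'
# Strict: spells out every step from the outer table (assumed structure)
STRICT_XPATH = '//table[@id="yfncsumtab"]/tr/td/table/tr/td/a[text()="Next"]'


def link_texts(doc, xpath):
    """Return the text of every element the XPath selects."""
    return [a.text for a in doc.xpath(xpath)]


page = make_page()
changed = make_page(extra_wrapper=True)

print(link_texts(page, QUESTION_XPATH))   # selects the rel="first" link
print(link_texts(page, LOOSE_XPATH))
print(link_texts(page, STRICT_XPATH))
print(link_texts(changed, LOOSE_XPATH))   # still matches after the change
print(link_texts(changed, STRICT_XPATH))  # no longer matches
```

On this sample, the question's XPath picks up the "First" link rather than "Next", and the fully spelled-out path stops matching as soon as one wrapper element is added, while the loose text-based path keeps working.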