c# xpath google-finance argumentnullexception

System.ArgumentNullException when trying to access span with XPath (C#)


I've been trying to get a program working that pulls various stock statistics from Google Finance. So far I have not been able to get any information out of span elements. For now I have hardcoded direct access to the Apple stock page: https://www.google.com/finance?q=NASDAQ%3AAAPL&ei=NgItWIG1GIftsAHCn4zIAg

What I can't understand is that I get the correct output when I try it in the Chrome console with the following command:

$x("//*[@id=\"appbar\"]//div//div//div//span");

This is my current code in Visual Studio 2015 with Html Agility Pack installed (I suspect the fault is in currDocNodeCompanyName):

class StockDataAccess
{
    HtmlWeb web = new HtmlWeb();
    private List<string> testList;

    public void FindStock()
    {
        var histDoc = web.Load("https://www.google.com/finance/historical?q=NASDAQ%3AAAPL&ei=q9IsWNm4KZXjsAG-4I7oCA.html");
        var histDocNode = histDoc.DocumentNode.SelectNodes("//*[@id=\"prices\"]//table//tr//td");

        var currDoc = web.Load("https://www.google.com/finance?q=NASDAQ%3AAAPL&ei=CdcsWMjNCIe0swGd3oaYBA.html");
        var currDocNodeCurrency = currDoc.DocumentNode.SelectNodes("//*[@id=\"ref_22144_elt\"]//div//div");
        var currDocNodeCompanyName = currDoc.DocumentNode.SelectNodes("//*[@id=\"appbar\"]//div//div//div//span");

        var histDocText = histDocNode.Select(node => node.InnerText);
        var currDocCurrencyText = currDocNodeCurrency.Select(node => node.InnerText);
        var currDocCompanyName = currDocNodeCompanyName.Select(node => node.InnerText);

        List<String> result = new List<string>(histDocText.Take(6));
        result.Add(currDocCurrencyText.First());
        result.Add(currDocCompanyName.Take(2).ToString());
        testList = result;
    }

    public List<String> ReturnStock()
    {
        return testList;
    }
}

I have also tried the XPath expression [text], which gives me output I can work with in the Chrome console but not in VS. I have also experimented with a foreach loop, which a few answers suggested to others:

class StockDataAccess
{
    HtmlWeb web = new HtmlWeb();
    private List<string> testList;

    public void FindStock()
    {
        // Same as before

        var currDoc = web.Load("https://www.google.com/finance?q=NASDAQ%3AAAPL&ei=CdcsWMjNCIe0swGd3oaYBA.html");
        HtmlNodeCollection currDocNodeCompanyName = currDoc.DocumentNode.SelectNodes("//*[@id=\"appbar\"]//div//div//div//span");

        // Same as before

        List<string> blaList = new List<string>();
        foreach (HtmlNode x in currDocNodeCompanyName)
        {
            blaList.Add(x.InnerText);
        }

        List<String> result = new List<string>(histDocText.Take(6));
        result.Add(currDocCurrencyText.First());
        result.Add(blaList[1]);
        result.Add(blaList[2]);

        testList = result;
    }

    public List<String> ReturnStock()
    {
        return testList;
    }
}

I would really appreciate it if anyone could point me in the right direction.


Solution

  • If you check the contents of currDoc.DocumentNode.InnerHtml, you will notice that there is no element with the id "appbar". The XPath therefore matches nothing, and the result is correct: SelectNodes returns null when there are no matches, and calling Select on that null result is what throws the ArgumentNullException.

    I suspect that the HTML element you're trying to find is generated by a script (JavaScript, for example). That would explain why you can see it in the browser but not in the HtmlDocument object: HtmlAgilityPack does not execute scripts; it only downloads and parses the raw source.
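
    As a rough sketch of how to guard against this, the snippet below wraps SelectNodes in a null check so a missing element yields an empty list instead of an exception. XPathHelper and SelectTexts are hypothetical names made up for this illustration, and the inline HTML string is only a stand-in for the raw page source (it is not Google's actual markup). It assumes the HtmlAgilityPack NuGet package is referenced.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using HtmlAgilityPack;

static class XPathHelper
{
    // Hypothetical helper: returns the trimmed inner text of every node
    // matched by the XPath, or an empty list when nothing matches.
    public static List<string> SelectTexts(HtmlDocument doc, string xpath)
    {
        // SelectNodes returns null (not an empty collection) when no node matches.
        HtmlNodeCollection nodes = doc.DocumentNode.SelectNodes(xpath);
        if (nodes == null)
        {
            return new List<string>();
        }
        return nodes.Select(node => node.InnerText.Trim()).ToList();
    }
}

static class Demo
{
    static void Main()
    {
        var doc = new HtmlDocument();

        // Stand-in for the raw source: it has a "prices" table but no
        // "appbar" element, mimicking a page that builds that part with a script.
        doc.LoadHtml("<div id=\"prices\"><table><tr><td>Nov 16, 2016</td></tr></table></div>");

        List<string> prices = XPathHelper.SelectTexts(doc, "//*[@id=\"prices\"]//table//tr//td");
        List<string> names = XPathHelper.SelectTexts(doc, "//*[@id=\"appbar\"]//div//div//div//span");

        Console.WriteLine("price cells found: " + prices.Count);
        Console.WriteLine("company name spans found: " + names.Count);

        if (names.Count == 0)
        {
            // Nothing matched: the element is probably not in the raw source.
            Console.WriteLine("No match; inspect doc.DocumentNode.InnerHtml for the id.");
        }
    }
}
```

    This only stops the crash; it does not make the script-generated element appear. If the data really is injected by JavaScript, it has to come from somewhere else, for example another part of the raw source that does contain it, or a tool that actually executes the page's scripts.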