
HTML Agility Pack Null Reference


I'm having some trouble with the HTML Agility Pack.

I get a null reference exception when I call this method on HTML that does not contain the node I'm selecting. It worked at first, but then it stopped working. This is only a snippet; there are about 10 more foreach loops that select different nodes.

What am I doing wrong?

public string Export(string html)
{
    var doc = new HtmlDocument();
    doc.LoadHtml(html);
    // exception gets thrown on below line
    foreach (var repeater in doc.DocumentNode.SelectNodes("//table[@class='mceRepeater']"))
    {
        if (repeater != null)
        {
            repeater.Name = "editor:repeater";
            repeater.Attributes.RemoveAll();
        }
    }

    var sw = new StringWriter();
    doc.Save(sw);
    sw.Flush();

    return sw.ToString();
}

Solution

  • DocumentNode.SelectNodes returns null, not an empty collection, when no nodes match the XPath expression. The foreach then tries to enumerate null, which is what throws.

    This is the default behaviour; see the discussion thread on CodePlex: Why DocumentNode.SelectNodes returns null

    A workaround is to check the result before looping:

    // SelectNodes yields null when nothing matches, so guard before iterating
    var repeaters = doc.DocumentNode.SelectNodes("//table[@class='mceRepeater']");
    if (repeaters != null)
    {
        foreach (var repeater in repeaters)
        {
            if (repeater != null)
            {
                repeater.Name = "editor:repeater";
                repeater.Attributes.RemoveAll();
            }
        }
    }
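
Since the question mentions about ten more loops of the same shape, repeating the null check in each one gets noisy. One possible way to avoid that is a small extension method that substitutes an empty sequence for null. This is only a sketch: `SelectNodesOrEmpty` and the `Exporter` class are hypothetical names introduced here, not part of HTML Agility Pack; the only library calls used are the ones already in the question (`LoadHtml`, `SelectNodes`, `Save`).

```csharp
using System.Collections.Generic;
using System.IO;
using System.Linq;
using HtmlAgilityPack;

public static class HtmlNodeExtensions
{
    // Hypothetical helper (not an HTML Agility Pack API):
    // wraps SelectNodes so callers always get something enumerable.
    public static IEnumerable<HtmlNode> SelectNodesOrEmpty(this HtmlNode node, string xpath)
    {
        // SelectNodes returns null when no node matches the XPath.
        IEnumerable<HtmlNode> nodes = node.SelectNodes(xpath);
        return nodes ?? Enumerable.Empty<HtmlNode>();
    }
}

public static class Exporter
{
    public static string Export(string html)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        // No null check needed: the loop body is skipped when nothing matches.
        foreach (var repeater in doc.DocumentNode.SelectNodesOrEmpty("//table[@class='mceRepeater']"))
        {
            repeater.Name = "editor:repeater";
            repeater.Attributes.RemoveAll();
        }

        var sw = new StringWriter();
        doc.Save(sw);
        return sw.ToString();
    }
}
```

With a helper like this, each of the other foreach loops can call `SelectNodesOrEmpty` with its own XPath and keep the same body it has now. The per-item `repeater != null` check is left out here because the collection is either null as a whole or contains matched nodes.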