
How do I parse XML with a namespace using gokogiri (libxml2)?


I am using github.com/moovweb/gokogiri to parse an XML document. The code below works when parsing var b, but when I run it on var a (which declares a default namespace) I get no output. How do I parse XML that has a namespace using gokogiri?

package main

import (
    "github.com/moovweb/gokogiri"
    "github.com/moovweb/gokogiri/xpath"
    "log"
)

func main() {
    log.SetFlags(log.Lshortfile)
    doc, _ := gokogiri.ParseXml([]byte(a))
    defer doc.Free()
    doc.SetNamespace("", "http://example.com/this")
    x := xpath.Compile(".//NodeA/NodeB")
    groups, err := doc.Search(x)
    if err != nil {
        log.Println(err)
    }
    for i, group := range groups {
        log.Println(i, group)
    }
}

var a = `<?xml version="1.0" ?><NodeA xmlns="http://example.com/this"><NodeB>thisthat</NodeB></NodeA>`
var b = `<?xml version="1.0" ?><NodeA><NodeB>thisthat</NodeB></NodeA>`

EDIT #1: I've also tried doc.RegisterNamespace, but got

doc.RegisterNamespace undefined (type *xml.XmlDocument has no field or method RegisterNamespace)

and x.RegisterNamespace, which gives

x.RegisterNamespace undefined (type *xpath.Expression has no field or method RegisterNamespace)


Solution

  • Even though the namespace in the XML is declared without a prefix (i.e. it is the default namespace), you still need to register a prefix for it and use that prefix in your XPath expression. In XPath 1.0 an unprefixed name such as NodeA only matches elements that are in no namespace, which is why the original expression returns nothing.

    The prefix can be anything you like; here I used ns. It can differ from the prefix used in the document (if any). What has to match is the namespace URI itself.


    Example:

    package main
    
    import (
        "fmt"
        "github.com/moovweb/gokogiri"
        "github.com/moovweb/gokogiri/xpath"
    )
    
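    func main() {
        doc, err := gokogiri.ParseXml([]byte(a))
        if err != nil {
            fmt.Println(err)
            return
        }
        defer doc.Free()
        // Bind an arbitrary prefix to the document's default namespace URI.
        xp := doc.DocXPathCtx()
        xp.RegisterNamespace("ns", "http://example.com/this")
        // Every step in the path must carry the registered prefix.
        x := xpath.Compile("/ns:NodeA/ns:NodeB")
        groups, err := doc.Search(x)
        if err != nil {
            fmt.Println(err)
        }
        for i, group := range groups {
            fmt.Println(i, group.Content())
        }
    }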
    
    var a = `<?xml version="1.0" ?><NodeA xmlns="http://example.com/this"><NodeB>thisthat</NodeB></NodeA>`
    

    Output:

    0 thisthat
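
To see why only the namespace URI matters and not the prefix, here is a minimal sketch that uses Go's standard encoding/xml package instead of gokogiri (a swapped-in library, chosen only so the example runs without cgo). The helper rootName and the three sample documents are hypothetical names introduced for illustration. The decoder resolves each element to a (namespace URI, local name) pair, so the default-namespace and prefixed documents should report the same root name, while the document without a namespace reports an empty URI.

```go
package main

import (
	"encoding/xml"
	"fmt"
	"strings"
)

// Three hypothetical documents: default namespace, explicit prefix, no namespace.
const (
	defaultNS  = `<NodeA xmlns="http://example.com/this"><NodeB>thisthat</NodeB></NodeA>`
	prefixedNS = `<p:NodeA xmlns:p="http://example.com/this"><p:NodeB>thisthat</p:NodeB></p:NodeA>`
	noNS       = `<NodeA><NodeB>thisthat</NodeB></NodeA>`
)

// rootName returns the namespace-qualified name of the first element in doc.
// Decoder.Token translates prefixes into namespace URIs, so the prefix used
// in the document text does not appear in the result.
func rootName(doc string) (xml.Name, error) {
	dec := xml.NewDecoder(strings.NewReader(doc))
	for {
		tok, err := dec.Token()
		if err != nil {
			return xml.Name{}, err
		}
		if start, ok := tok.(xml.StartElement); ok {
			return start.Name, nil
		}
	}
}

func main() {
	for _, doc := range []string{defaultNS, prefixedNS, noNS} {
		name, err := rootName(doc)
		if err != nil {
			fmt.Println(err)
			continue
		}
		fmt.Printf("space=%q local=%q\n", name.Space, name.Local)
	}
}
```

The same identity rule is what the XPath engine applies: ns:NodeA in the expression and the unprefixed NodeA in the document are the same name because both resolve to the URI http://example.com/this.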