Parsing RSS feed in Go


I am trying to write a podcast downloader in Go. The following code parses an RSS feed, but the channel's link is empty when I print the parsed data to standard output. I don't know why. Any suggestions? I am new to Go.

package main

import (
    "encoding/xml"
    "fmt"
    "net/http"
)

type Enclosure struct {
    Url    string `xml:"url,attr"`
    Length int64  `xml:"length,attr"`
    Type   string `xml:"type,attr"`
}

type Item struct {
    Title     string    `xml:"title"`
    Link      string    `xml:"link"`
    Desc      string    `xml:"description"`
    Guid      string    `xml:"guid"`
    Enclosure Enclosure `xml:"enclosure"`
    PubDate   string    `xml:"pubDate"`
}

type Channel struct {
    Title string `xml:"title"`
    Link  string `xml:"link"`
    Desc  string `xml:"description"`
    Items []Item `xml:"item"`
}

type Rss struct {
    Channel Channel `xml:"channel"`
}

func main() {
    resp, err := http.Get("http://www.bbc.co.uk/programmes/p02nrvz8/episodes/downloads.rss")
    if err != nil {
        fmt.Printf("Error GET: %v\n", err)
        return
    }
    defer resp.Body.Close()

    rss := Rss{}

    decoder := xml.NewDecoder(resp.Body)
    err = decoder.Decode(&rss)
    if err != nil {
        fmt.Printf("Error Decode: %v\n", err)
        return
    }

    fmt.Printf("Channel title: %v\n", rss.Channel.Title)
    fmt.Printf("Channel link: %v\n", rss.Channel.Link)

    for i, item := range rss.Channel.Items {
        fmt.Printf("%v. item title: %v\n", i, item.Title)
    }
}

Solution

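  • The XML from the RSS feed has a channel element with two child elements whose local name is 'link': 'link' and 'atom:link'. Even though one has a namespace prefix, Go's XML unmarshaller matches both against the `xml:"link"` tag, because a tag without a namespace matches on the local name alone. The empty 'atom:link' element comes later and overwrites the value, so the field ends up empty. See also "local name collisions fail" and the related issue on GitHub.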

    <?xml version="1.0" encoding="UTF-8"?>
    ...
       <channel>
          <title>Forum - Sixty Second Idea to Improve the World</title>
          <link>http://www.bbc.co.uk/programmes/p02nrvz8</link>
          ...
          <atom:link href="http://www.bbc.co.uk/..." />
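One possible workaround, sketched under a few assumptions: collect every element named 'link' into a slice and record each element's namespace through an `XMLName` field, then pick the entry with no namespace. The names `Link`, `plainLink`, `parse`, and `sampleFeed` are made up for this illustration, and the feed is a trimmed stand-in with example.com URLs, not the real BBC feed.

```go
package main

import (
	"encoding/xml"
	"fmt"
	"strings"
)

// Link captures any element whose local name is "link".
// XMLName records the element's namespace and local name as seen by the decoder.
type Link struct {
	XMLName xml.Name
	Href    string `xml:"href,attr"` // set for <atom:link href="..."/>
	Text    string `xml:",chardata"` // set for <link>...</link>
}

type Channel struct {
	Title string `xml:"title"`
	Links []Link `xml:"link"` // a slice keeps both elements instead of overwriting
}

type Rss struct {
	Channel Channel `xml:"channel"`
}

// sampleFeed is a hypothetical, trimmed feed modelled on the excerpt above.
const sampleFeed = `<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom">
  <channel>
    <title>Example</title>
    <link>http://www.example.com/programme</link>
    <atom:link href="http://www.example.com/feed.rss" rel="self" />
  </channel>
</rss>`

// plainLink returns the text of the first <link> element that has no namespace.
func plainLink(c Channel) string {
	for _, l := range c.Links {
		if l.XMLName.Space == "" {
			return strings.TrimSpace(l.Text)
		}
	}
	return ""
}

// parse unmarshals a feed held in memory.
func parse(data string) (Rss, error) {
	var rss Rss
	err := xml.Unmarshal([]byte(data), &rss)
	return rss, err
}

func main() {
	rss, err := parse(sampleFeed)
	if err != nil {
		fmt.Printf("Error Decode: %v\n", err)
		return
	}
	fmt.Printf("Channel link: %v\n", plainLink(rss.Channel))
}
```

With the original program, the same idea would mean replacing the `Link string` field in `Channel` with a slice like the one above and printing `plainLink(rss.Channel)` instead of `rss.Channel.Link`. A simpler variant is a `[]string` field, which would also keep both values but not tell you which element each came from.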