"Infinite loop" causing unreachable code


I'm currently trying to work with the HTML tokenizer: https://godoc.org/golang.org/x/net/html.

What I want to do is the following: get all links from a URL and, if a link contains a certain string, add it to a list of URLs.

resp, err = client.Get("someurl")
var urls []string

if err != nil {
    log.Fatal(err)
}

z := html.NewTokenizer(resp.Body)

for {
    tt := z.Next()

    switch {
    case tt == html.ErrorToken:
        return
    case tt == html.StartTagToken:
        t := z.Token()

        isAnchor := t.Data == "a"
        if !isAnchor {
            continue
        }

        ok, url := getHref(t)
        if !ok {
            continue
        }
        if strings.Contains(url, "somestring") {
            urls = append(urls, url)
        }

    }
}

fmt.Println(urls)

This doesn't work because fmt.Println(urls) is reported as unreachable. The loop does of course end at some point, but the code doesn't compile. How do I make the code after the loop reachable?

Regards


Solution

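  • There's no break in the loop. The only way it ends is via the return, which sends control out of the whole function, so fmt.Println(urls) can never be reached. Note that a plain break inside the switch would only leave the switch, not the for loop, which is why the break below carries a label.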

    Try this:

    L:
    for {
        tt := z.Next()
    
        switch {
        case tt == html.ErrorToken:
            break L
        case tt == html.StartTagToken:
            t := z.Token()
    
            isAnchor := t.Data == "a"
            if !isAnchor {
                continue
            }
    
            ok, url := getHref(t)
            if !ok {
                continue
            }
            if strings.Contains(url, "somestring") {
                urls = append(urls, url)
            }
    
        }
    }
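
For anyone who wants to try the labeled break without fetching a page, here is a minimal sketch that uses only the standard library. The types tokenKind and token and the function collectLinks are hypothetical stand-ins invented for this example (they are not part of golang.org/x/net/html); they only mimic the shape of the tokenizer loop above.

```go
package main

import (
	"fmt"
	"strings"
)

// tokenKind is a hypothetical stand-in for html.TokenType so that the
// example runs with the standard library only.
type tokenKind int

const (
	errorToken tokenKind = iota // plays the role of html.ErrorToken
	startTag                    // plays the role of html.StartTagToken
	other                       // any token the loop ignores
)

// token is a hypothetical, simplified token: a kind plus an href value.
type token struct {
	kind tokenKind
	href string
}

// collectLinks walks the tokens and keeps every href containing substr.
func collectLinks(tokens []token, substr string) []string {
	var urls []string
	i := 0

loop:
	for {
		// Treat running off the end like the tokenizer's ErrorToken.
		t := token{kind: errorToken}
		if i < len(tokens) {
			t = tokens[i]
			i++
		}

		switch t.kind {
		case errorToken:
			// A bare "break" would only leave the switch; the label
			// makes it leave the enclosing for loop.
			break loop
		case startTag:
			if strings.Contains(t.href, substr) {
				urls = append(urls, t.href)
			}
		}
	}

	// Reachable, because the loop above can now end without returning.
	return urls
}

func main() {
	tokens := []token{
		{startTag, "https://example.com/somestring/a"},
		{other, ""},
		{startTag, "https://example.com/elsewhere"},
		{startTag, "https://example.com/b?q=somestring"},
	}
	fmt.Println(collectLinks(tokens, "somestring"))
}
```

Moving the loop into its own function like this is also an alternative to the label: inside such a helper, the error case could simply return urls, and the caller would print the result.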