Tags: go, concurrency, goroutine

Why is my code causing a stall or race condition?


For some reason, once I started sending strings through a channel in my goroutine, the code stalls when I run it. I thought it was a scope/closure issue, so I moved all the code directly into the function, to no avail. I have looked through Go's documentation and all the examples look similar to mine, so I am kind of clueless as to what is going wrong.

func getPage(url string, c chan<- string, swg sizedwaitgroup.SizedWaitGroup) {
    defer swg.Done()
    doc, err := goquery.NewDocument(url)

    if err != nil{
        fmt.Println(err)
    }

    nodes := doc.Find(".v-card .info")
    for i := range nodes.Nodes {
        el := nodes.Eq(i)
        var name string
        if el.Find("h3.n span").Size() != 0{
            name = el.Find("h3.n span").Text()
        }else if el.Find("h3.n").Size() != 0{
            name = el.Find("h3.n").Text()
        }

        address := el.Find(".adr").Text()
        phoneNumber := el.Find(".phone.primary").Text()
        website, _ := el.Find(".track-visit-website").Attr("href")
        //c <- map[string] string{"name":name,"address":address,"Phone Number": phoneNumber,"website": website,};
        c <- fmt.Sprint("%s%s%s%s",name,address,phoneNumber,website)
        fmt.Println([]string{name,address,phoneNumber,website,})

    }
}

func getNumPages(url string) int{
    doc, err := goquery.NewDocument(url)
    if err != nil{
        fmt.Println(err);
    }
    pagination := strings.Split(doc.Find(".pagination p").Contents().Eq(1).Text()," ")
    numItems, _ := strconv.Atoi(pagination[len(pagination)-1])
    return int(math.Ceil(float64(numItems)/30))
}


func main() {
    arrChan := make(chan string)
    swg := sizedwaitgroup.New(8)
    zips := []string{"78705","78710","78715"}

    for _, item := range zips{
        swg.Add()
        go getPage(fmt.Sprintf(base_url,item,1),arrChan,swg)
    }
    swg.Wait()

}

Edit: I fixed it by passing the sizedwaitgroup by reference, but when I remove the buffer from the channel it stops working again. Does that mean I need to know in advance how many elements will be sent to the channel?


Solution

  • Issue

    Building off of Colin Stewart's answer, from the code you have posted, as far as I can tell, your issue is actually with reading your arrChan. You write into it, but there's no place where you read from it in your code.

    From the documentation:

    If the channel is unbuffered, the sender blocks until the receiver has received the value. If the channel has a buffer, the sender blocks only until the value has been copied to the buffer; if the buffer is full, this means waiting until some receiver has retrieved a value.

    Making the channel buffered means your code no longer blocks on the channel write, i.e. the line that looks like:

    c <- fmt.Sprint("%s%s%s%s",name,address,phoneNumber,website)
    

    My guess is that if you're still hanging when the channel has a size of 5000, it's because more than 5000 values are sent across all of your loops over nodes.Nodes. Once a buffered channel is full, further sends block until a receiver frees up space, just as if you were writing to an unbuffered channel.
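
    To make the "full buffer blocks" behaviour concrete, here is a small sketch. The helper trySend is hypothetical (it is not part of the question's code), and a capacity of 2 stands in for 5000. It uses select with a default case so the program reports when a send would block instead of actually hanging.

```go
package main

import "fmt"

// trySend is a hypothetical helper: it attempts a non-blocking send and
// reports whether the value was accepted. A plain `c <- v` would block
// in every case where this returns false.
func trySend(c chan<- string, v string) bool {
	select {
	case c <- v:
		return true
	default:
		return false
	}
}

func main() {
	c := make(chan string, 2) // small capacity for illustration only

	fmt.Println(trySend(c, "a")) // true: the buffer has room
	fmt.Println(trySend(c, "b")) // true: the buffer is now full
	fmt.Println(trySend(c, "c")) // false: a plain send would block here

	<-c                          // a receiver frees one slot
	fmt.Println(trySend(c, "d")) // true: there is room again
}
```

    With no receiver at all, a bigger buffer only postpones the point at which the senders stall.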

    Fix

    Here's a minimal example showing how you would fix something like this (basically, just add a reader):

    package main
    
    import "sync"
    
    func getPage(item string, c chan<- string) {
        c <- item
    }
    
    func readChannel(c <-chan string) {
        for {
            <-c
        }
    }
    
    func main() {
        arrChan := make(chan string)
        wg := sync.WaitGroup{}
        zips := []string{"78705", "78710", "78715"}
    
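        for _, item := range zips {
            wg.Add(1)
            // Pass item as an argument so each goroutine gets its own copy
            // (before Go 1.22 the loop variable was shared across iterations).
            go func(item string) {
                defer wg.Done()
                getPage(item, arrChan)
            }(item)
        }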
        go readChannel(arrChan) // comment this out and you'll deadlock
        wg.Wait()
    }
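
    On the follow-up question in the edit: you should not need to know in advance how many values will be sent. One common pattern, sketched below under some assumptions, is to keep the channel unbuffered, close it once every sender has finished, and let the reader range over it until it is closed. The names fetch and collect are hypothetical stand-ins (fetch plays the role of getPage and just sends two made-up strings per zip), and sync.WaitGroup is used in place of sizedwaitgroup to keep the sketch self-contained.

```go
package main

import (
	"fmt"
	"sort"
	"sync"
)

// fetch is a hypothetical stand-in for getPage: it sends a number of
// results that the receiver does not know in advance.
func fetch(zip string, c chan<- string) {
	for i := 1; i <= 2; i++ {
		c <- fmt.Sprintf("%s-result%d", zip, i)
	}
}

// collect starts one sender goroutine per zip and returns every value
// that was sent, sorted so the output order is stable.
func collect(zips []string) []string {
	c := make(chan string) // unbuffered: no capacity guess needed
	var wg sync.WaitGroup

	for _, zip := range zips {
		wg.Add(1)
		go func(zip string) {
			defer wg.Done()
			fetch(zip, c)
		}(zip)
	}

	// Close the channel once all senders are done. This runs in its own
	// goroutine so the loop below can keep receiving in the meantime.
	go func() {
		wg.Wait()
		close(c)
	}()

	var results []string
	for r := range c { // the loop ends when c is closed and drained
		results = append(results, r)
	}
	sort.Strings(results)
	return results
}

func main() {
	for _, r := range collect([]string{"78705", "78710", "78715"}) {
		fmt.Println(r)
	}
}
```

    Here the receiving loop runs in the calling goroutine and wg.Wait() moves into a helper goroutine. Calling wg.Wait() before anything reads from an unbuffered channel is what produces the stall in the first place.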