Tags: multithreading, go, io, thread-safety, goroutine

Lock when using io.Copy in a goroutine


I have a slice (filesMeta) containing a large number of "FileMetadata" structs. I also have another slice (candidates) containing the indexes of some of those structs. I want to modify the filesMeta slice to add an md5 hash, but only for the elements whose indexes are in the candidates slice.

I'm using goroutines to parallelize the work, but the io.Copy call locks up and I don't understand why.

This is the code:

for i := range candidates {
    wg.Add(1)
    go func(i int) {
        defer wg.Done()
        filesMeta[candidates[i]].Hash = md5Hash(filesMeta[candidates[i]].FullPath)
    }(i)
}
wg.Wait()


func md5Hash(filePath string) string {
    file, err := os.Open(filePath)
    if err != nil { 
        panic(err) 
    }
    defer file.Close()

    hash := md5.New()
    if _, err := io.Copy(hash, file); err != nil {
        panic(err)
    }
    hashInBytes := hash.Sum(nil)

    return hex.EncodeToString(hashInBytes)
}

Thanks!

Edit: One more detail: it doesn't lock up when the files being hashed are on my SSD, but it does when they are on a file share.

Edit2: I noticed I wasn't passing wg to the goroutine. The code now looks like this (it still locks up the same way):

for i := range candidates {
    wg.Add(1)
    go func(i int, wg *sync.WaitGroup) {
        defer wg.Done()
        filesMeta[candidates[i]].Hash = md5Hash(filesMeta[candidates[i]].FullPath)
    }(i, &wg)
}
wg.Wait()


func md5Hash(filePath string) string {
    file, err := os.Open(filePath)
    if err != nil { 
        panic(err) 
    }
    defer file.Close()

    hash := md5.New()
    if _, err := io.Copy(hash, file); err != nil {
        panic(err)
    }
    hashInBytes := hash.Sum(nil)

    return hex.EncodeToString(hashInBytes)
}

Solution

  • MarcoLucidi was right: I was opening too many files at once. I limited the number of concurrent goroutines and now it works fine.
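
For anyone landing here later, this is a minimal sketch of one way to cap the number of concurrent goroutines, using a buffered channel as a counting semaphore. It is not necessarily the exact change made to the code above. The names hashCandidates, maxOpen and demo are hypothetical, the FileMetadata fields are inferred from the question, and the sketch assumes candidates contains no duplicate indexes (so no two goroutines write to the same element).

```go
package main

import (
	"crypto/md5"
	"encoding/hex"
	"fmt"
	"io"
	"os"
	"path/filepath"
	"sync"
)

// FileMetadata is assumed to have at least these two fields,
// as used in the question.
type FileMetadata struct {
	FullPath string
	Hash     string
}

// md5Hash is the function from the question, unchanged in behavior.
func md5Hash(filePath string) string {
	file, err := os.Open(filePath)
	if err != nil {
		panic(err)
	}
	defer file.Close()

	hash := md5.New()
	if _, err := io.Copy(hash, file); err != nil {
		panic(err)
	}
	return hex.EncodeToString(hash.Sum(nil))
}

// hashCandidates (hypothetical name) hashes only the elements listed in
// candidates, with at most maxOpen files open at any time.
// maxOpen must be at least 1, otherwise the first send blocks forever.
func hashCandidates(filesMeta []FileMetadata, candidates []int, maxOpen int) {
	sem := make(chan struct{}, maxOpen) // counting semaphore
	var wg sync.WaitGroup
	for _, idx := range candidates {
		sem <- struct{}{} // blocks while maxOpen goroutines are still running
		wg.Add(1)
		go func(idx int) {
			defer wg.Done()
			defer func() { <-sem }() // free the slot when this file is done
			filesMeta[idx].Hash = md5Hash(filesMeta[idx].FullPath)
		}(idx)
	}
	wg.Wait()
}

// demo (hypothetical) writes 8 small temp files, hashes the odd-indexed
// ones two at a time, and returns the resulting slice.
func demo() []FileMetadata {
	dir, err := os.MkdirTemp("", "hashdemo")
	if err != nil {
		panic(err)
	}
	defer os.RemoveAll(dir)

	filesMeta := make([]FileMetadata, 8)
	var candidates []int
	for i := range filesMeta {
		p := filepath.Join(dir, fmt.Sprintf("f%d.txt", i))
		if err := os.WriteFile(p, []byte("hello"), 0o644); err != nil {
			panic(err)
		}
		filesMeta[i].FullPath = p
		if i%2 == 1 {
			candidates = append(candidates, i)
		}
	}
	hashCandidates(filesMeta, candidates, 2)
	return filesMeta
}

func main() {
	for i, fm := range demo() {
		fmt.Println(i, fm.Hash)
	}
}
```

Because the slot is acquired before the goroutine is started, the loop itself waits whenever maxOpen hashes are in flight, so neither the number of goroutines nor the number of open files grows with len(candidates). A reasonable value for maxOpen depends on the open-file limit and on how many parallel reads the file share handles well.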