Search code examples
mysqlgoconcurrencysegmentation-faultgo-gorm

Concurrent MySQL writing with GORM leads to an error


I have implemented a complex csv import script in Golang. I use a Workerpool implementation for it. Inside that workerpool, workers run through 1000s of small csv files, categorizing, tagging and branding the products. And they all write to the same database table. So far so good.

The problem i'm facing is, that if i chose more than 2 workers, the process crashes with the following message randomly

console output

The workflow is

foreach (csv) {
 workerPool.submit(csv)
}

func worker(csv) {
 foreach (line) {
   import(line)
 }
}

import(line) {
 product = get(line)
 product.category = determine_category(product)
 product.brand = determine_brand(product)
 save(brand)
 product.tags = determine_tags(product)
 //and after all
 save(product)
}

I tried to wrap the save() calls in transactions, but it didn't help.

Now i have the following questions:

  1. Is MySQL suited to save concurrently to 1 table?
  2. If transactions are need to accomplish this, where should they be set?
  3. Is the Go SQL Driver (where the error ALWAYS happens in packets.go:1102) suited to do this ?
  4. Could anyone help me (maybe by hiring for a few hours)?

I'm completely stuck. I can also share the sourcecode if that helps. But I first wanted to know i you guess that it's rather my code or a general issue.


Solution

  • Open a new db connection in each goroutine (or thread, for languages that use threads).

    MySQL's protocol is stateful, which means if multiple goroutines attempt to use the same connection, the requests and responses get very confused.

    You would have the same problem trying to share any other kind of stateful protocol connection between goroutines.

    For example ftp is also a stateful protocol, and that may be easier to understand. A client goroutine might send a message like "get file x" and the response should be a series of messages containing the content of that file. If another goroutine tries to use the same connection while that request/response is inprogress, both clients will be confused. The second goroutine will read packets that belong to a file it didn't request. The first goroutine who requested the file will find some packets it was expecting have already been read.

    Similarly, MySQL's protocol does not support multiple client goroutines sharing a single connection.