mysql, sql, scala, slick, typesafe

Slick 3.0 bulk insert or update (upsert)


What is the correct way to do a bulk insertOrUpdate in Slick 3.0?

I am using MySQL where the appropriate query would be

INSERT INTO table (a,b,c) VALUES (1,2,3),(4,5,6)
ON DUPLICATE KEY UPDATE c=VALUES(a)+VALUES(b);

(See the related question "MySQL bulk INSERT or UPDATE".)

Here is my current code, which is very slow :-(

// FIXME -- this is slow but will stop repeats; an insertOrUpdate
// function for a list would be much better
val rowsInserted = rows.map {
  row => await(run(TableQuery[FooTable].insertOrUpdate(row)))
}.sum

What I am looking for is the equivalent of

def insertOrUpdate(values: Iterable[U]): DriverAction[MultiInsertResult, NoStream, Effect.Write]

Solution

  • There are several ways to make this code faster. Each one should be faster than the one before it, but each is also progressively less idiomatic Slick:

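    • Run insertOrUpdateAll instead of insertOrUpdate if you are on slick-pg 0.16.1+ (note that slick-pg is a PostgreSQL extension, so this option does not apply to the MySQL setup in the question):

      await(run(TableQuery[FooTable].insertOrUpdateAll(rows))).sum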
      
    • Run your DBIO actions all at once, rather than waiting for each one to complete before you start the next:

      val toBeInserted = rows.map { row => TableQuery[FooTable].insertOrUpdate(row) }
      val inOneGo = DBIO.sequence(toBeInserted)
      val dbioFuture = run(inOneGo)
      // Optionally, you can add a `.transactionally`
      // and / or `.withPinnedSession` here to pin all of these upserts
      // to the same transaction / connection
      // which *may* get you a little more speed:
      // val dbioFuture = run(inOneGo.transactionally)
      val rowsInserted = await(dbioFuture).sum
      
    • Drop down to the JDBC level and run your upsert all in one go (idea via this answer):

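      val SQL = """INSERT INTO table (a,b,c) VALUES (?, ?, ?)
      ON DUPLICATE KEY UPDATE c=VALUES(a)+VALUES(b);"""
      
      SimpleDBIO[List[Int]] { session =>
        val statement = session.connection.prepareStatement(SQL)
        try {
          // foreach, not map: we only want the side effect of queuing each row
          rows.foreach { row =>
            statement.setInt(1, row.a)
            statement.setInt(2, row.b)
            statement.setInt(3, row.c)
            statement.addBatch()
          }
          // executeBatch returns Array[Int] (one update count per batched row)
          statement.executeBatch().toList
        } finally statement.close()
      }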
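
If you would rather send the literal multi-row statement from the question (one INSERT with several VALUES groups) than a JDBC batch, one possible approach is to generate the placeholder groups yourself and bind the values in order. The following is only a rough sketch: `Foo`, `UpsertSql`, `multiRowUpsert` and `bindValues` are hypothetical names made up for illustration, not part of Slick or JDBC, and the table and column names are the placeholders from the question.

```scala
// Hypothetical row type standing in for the question's table (a, b, c).
final case class Foo(a: Int, b: Int, c: Int)

object UpsertSql {
  // Builds one statement with a "(?, ?, ?)" group per row.
  // "table", "a", "b" and "c" are the placeholder names from the question.
  def multiRowUpsert(rowCount: Int): String = {
    require(rowCount > 0, "need at least one row")
    val placeholders = Seq.fill(rowCount)("(?, ?, ?)").mkString(", ")
    s"INSERT INTO table (a, b, c) VALUES $placeholders " +
      "ON DUPLICATE KEY UPDATE c = VALUES(a) + VALUES(b)"
  }

  // Flattens the rows into the order in which the "?" placeholders appear,
  // so that value i is bound to JDBC parameter index i + 1.
  def bindValues(rows: Seq[Foo]): Seq[Int] =
    rows.flatMap(r => Seq(r.a, r.b, r.c))
}
```

Inside the SimpleDBIO block you would then prepare `UpsertSql.multiRowUpsert(rows.size)`, call `setInt(i + 1, v)` for each `(v, i)` in `UpsertSql.bindValues(rows).zipWithIndex`, and run a single `executeUpdate()`. For very large inputs it is probably worth splitting `rows` with `grouped(n)` first so that no single statement grows too large.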