Search code examples
laravelmariadb

How to correctly bulk insert in MariaDB


Problem background: I am trying to do a bulk insert of many insert statements in laravel using DB::unprepared function by constructing and executing a procedure in MariaDB:

DELIMITER //
DROP PROCEDURE IF EXISTS procedure_name;
CREATE PROCEDURE procedure_name;
BEGIN
    BEGIN
        INSERT INTO table_name (param1, param2, ...) VALUES (val1, val2, ...);
        INSERT INTO table_name (param1, param2, ...) VALUES (val1, val2, ...);
        INSERT INTO table_name (param1, param2, ...) VALUES (val1, val2, ...);
        /* (996 more insert statements) */
        INSERT INTO table_name (param1, param2, ...) VALUES (val1, val2, ...);
    END;
    BEGIN
        INSERT INTO table_name (param1, param2, ...) VALUES (val1, val2, ...);
        INSERT INTO table_name (param1, param2, ...) VALUES (val1, val2, ...);
        INSERT INTO table_name (param1, param2, ...) VALUES (val1, val2, ...);
        /* (996 more insert statements) */
        INSERT INTO table_name (param1, param2, ...) VALUES (val1, val2, ...);
    END;
    /* many more such BEGIN END inserts */
END // 
DELIMITER ;
CALL procedure_name;
DROP PROCEDURE IF EXISTS procedure_name;

Each inner BEGIN ... END pair has 1000 insert statements. In a procedure there can be many (a lot more than 20) such BEGIN ... END pairs.

I am using and executing a procedure because this is the only way I got bulk insert to work on test data. I was referencing this MariaDB documentation. When inserting real data that produces more than 100 insert statements I get this error:

ERROR 1064 (42000) at line 5: You have an error in your SQL syntax; check the manual that corresponds to your MariaDB server version for the right syntax to use near 'BEGIN
BEGIN
        INSERT INTO items (vin, destination, transport_id, port_of_origi...' at line 2

I searched for some answers like this one.

I also tried to omit DELIMITER, without success.


Solution

  •     INSERT INTO table_name (param1, param2, ...)
            VALUES
                (val1, val2, ...),
                (val1, val2, ...),
                ... /* (997 more "row constructors") */
                (val1, val2, ...);
    

    This does not need to be in a stored Proc. But if it is, you need to use a pair of DELIMITER statements when defining the SP.

    1000 rows is a good number. More than 1K might get into various limits or allocation issues. Less than 1K would be slightly slower. Roughly speaking, 10 units of overhead for INSERT, 1 unit of processing time per row. So 1K rows is 99% of the theoretical limit.

    If you autocommit=ON and don't have transactions around the set of them, this also feeds into the efficiency. But a crash will stop after some multiple of 1K rows.

    Insert less than 1K rows if the text string is bigger than max_allowed_packet.