I am trying to make a helper function to make inserts to a SQLite database easy to generalize.
For example:
member __.RegisterUsers (users : UserData array) =
insertOrReplace "User" users [|
("id", fun u -> u.id)
("nick", fun u -> u.nick)
("is_verified", fun u -> u.is_verified)
("is_banned", fun u -> u.is_banned)
("subscriptions", fun u -> u.num.subscriptions)
("subscribers", fun u -> u.num.subscribers)
("total_posts", fun u -> u.total_posts)
("original_nick", fun u -> u.original_nick)
|]
My implementation is as follows:
let insertOrReplace (table:string) (items:'T array) (args:(string * ('T -> obj)) array)=
let fieldNames = args |> Seq.map fst |> Seq.toArray
let parameterNames = fieldNames |> Array.mapi (fun i name -> $"@{name}{i}")
let commandText =
StringBuilder()
.Append($"INSERT OR REPLACE INTO {table} (")
.AppendJoin(", ", fieldNames)
.Append(") VALUES (")
.AppendJoin(", ", parameterNames)
.Append(");")
.ToString()
use transaction = connection.BeginTransaction()
use command = new SqliteCommand(commandText, connection, transaction)
let parameters =
parameterNames |> Array.map (fun name ->
let parameter = command.CreateParameter()
parameter.ParameterName <- name
command.Parameters.Add(parameter)
)
try
let modifiedFields = items |> Array.map (fun item ->
Array.iter2 (fun (name, selector) (parameter:SqliteParameter) -> parameter.Value <- selector item) args parameters
command.ExecuteNonQuery()
)
transaction.Commit()
modifiedFields |> Array.sum |> Ok
with
| _ as ex -> Error ex.Message
This uses Microsoft's Microsoft.Data.Sqlite library
The table definition I'm trying to insert to:
CREATE TABLE IF NOT EXISTS User (
id TEXT PRIMARY KEY,
nick TEXT NOT NULL,
is_verified INTEGER NOT NULL,
is_banned INTEGER NOT NULL,
subscriptions INTEGER NOT NULL,
subscribers INTEGER NOT NULL,
total_posts INTEGER NOT NULL,
original_nick TEXT NOT NULL
);
This "works" in the sense that the data is inserted correctly, but it's extremely slow and takes about 75-110 ms per record. For a single record, that's negligible, but I'm trying to insert hundreds at a time and it's becoming a serious bottleneck.
Things I've tried:
The problem was not with any of the code shown here, but with the several other tables that all had foreign key references to the User
table. There were several hundred thousand records with foreign key references to User
s, and apparently checking foreign key integrity for that many was just too much.
Data integrity is not a huge concern for this project and missing key references are not a problem, so I decided to remove all foreign key constraints. I left primary key constraints in place, as that is important for my use case.
After rebuilding the database with no foreign keys, insert times are down to about 10-30ms per batch as opposed to the 10-30+ seconds previously.
For anyone in the future with a similar problem where dropping foreign key constraints is acceptable for their project, I found this guide (archive).