Search code examples
c#databasesqliteinsertstreamreader

Is there a faster way to read from a file and insert into SQLite


Afternoon everyone. I'm not that familiar with SQLite so I haven't messed with all of the settings of the database. I'm rather familiar with SQL Server, Oracle and even some Access and mySQL. Well, currently, I'm taking a file with 110,000+ records and reading the file line by line, parsing the data and running an insert statement to a table. The table depends on the record type which is the first field of line. Well, I'm loading it right now, and it's been running for 12 minutes (as I write this) and has only imported 14,000 records. Doing the math, it means that it will take somewhere between 1 hour and 15 minutes to 1 hour and 30 minutes. Depending on how the rest of my system acts at the time. Because there are different records types, I couldn't do a bulk insert if there was an option for SQLite (not sure if there is). This is running as a background worker. Below is the function that pulls and parses the data, and the function that inserts it into the DB. Keep in mind, this is a C# application in MVC format (was like that when I took control over it and don't have time to restructure it):

MainForm.cs Background Worker Functions

#region Background Worker Functions

    #region private void InitializeBackgroundWorker()
    /*************************************************************************************
    *************************************************************************************/
    private void InitializeBackgroundWorker()
    {
        backgroundWorker.DoWork +=
            new DoWorkEventHandler(backgroundWorker1_DoWork);
        backgroundWorker.RunWorkerCompleted +=
            new RunWorkerCompletedEventHandler(
        backgroundWorker1_RunWorkerCompleted);
        backgroundWorker.ProgressChanged +=
            new ProgressChangedEventHandler(
        backgroundWorker1_ProgressChanged);
    }
    #endregion

/*****************************************************************************************************************************************************************************************************/

    #region private void backgroundWorker1_DoWork(object sender, DoWorkEventArgs e)
    /*************************************************************************************
    *************************************************************************************/
    private void backgroundWorker1_DoWork(object sender, DoWorkEventArgs e)
    {
        // Get the BackgroundWorker that raised this event.
        BackgroundWorker worker = sender as BackgroundWorker;

        // Assign the result of the computation
        // to the Result property of the DoWorkEventArgs
        // object. This is will be available to the 
        // RunWorkerCompleted eventhandler.

        //Creates a static singleton file list.  Remains on the stack and can be accessed anywhere without
        // reinstatiating
        object[] obj = (object[])e.Argument;
        string fileName = obj[0].ToString();
        DataController controller = new DataController(worker, e);
        controller.FileName = fileName;
        try
        {
            if (strProcess == "Import")
            {
                controller.Import();
            }
            else if (strProcess == "Export")
            {
                controller.ExportToExcel();
            }
        }
        catch (Exception ex)
        {
            MessageBox.Show(ex.Message.ToString());
        }
    }
    #endregion

/*****************************************************************************************************************************************************************************************************/

    #region private void backgroundWorker1_RunWorkerCompleted(object sender, RunWorkerCompletedEventArgs e)
    /*************************************************************************************
    *************************************************************************************/
    private void backgroundWorker1_RunWorkerCompleted(object sender, RunWorkerCompletedEventArgs e)
    {
        if (e.Error != null)
        {
            MessageBox.Show(e.Error.StackTrace);
        }
        else
        {
            this.toolStripStatusLabel1.Text = "Import complete";
            generateReport();
            treeViewFigure.Nodes.Clear();
            BuildTree();
            treeViewFigure.TopNode.ExpandAll();
            labelIPBNumber.Text = controller.IPBNumber;
            this.Text += "IPB: " + labelIPBNumber.Text;

            cmbIndentureLevel.Items.Clear();
        }
    }
    #endregion

/*****************************************************************************************************************************************************************************************************/

    #region private void backgroundWorker1_ProgressChanged(object sender, ProgressChangedEventArgs e)
    /*************************************************************************************
    *************************************************************************************/
    private void backgroundWorker1_ProgressChanged(object sender, ProgressChangedEventArgs e)
    {
        string stat = e.UserState.ToString();
        this.toolStripStatusLabel1.Text = "";
        this.toolStripStatusLabel1.Text = stat;
        this.toolStripProgressBar1.Value = e.ProgressPercentage;
    }
    #endregion

#endregion

Importer.cs Import Function

    #region public void Import(string fileName)
    /*************************************************************************************
    *************************************************************************************/
    public void Import(string fileName)
    {
        if (!File.Exists(fileName))
        {
            throw new FileNotFoundException();
        }

        StreamReader read = File.OpenText(fileName);
        List<RecordBase> List = new List<RecordBase>();
        DataFactory factory = DataFactory.BuildFactory();

        int nLines = 0;

        while (!read.EndOfStream)
        {
            read.ReadLine();
            nLines++;
        }

        read.Close();
        read = File.OpenText(fileName);

        factory.lstObservers = _observers;
        factory.ClearDB();

        int count = 1;

        while (!read.EndOfStream)
        {
            string[] fields = read.ReadLine().Split('|');
            List<string> lstStr = new List<string>();
            foreach (string str in fields)
            {
                lstStr.Add(str);
            }

            lstStr.RemoveAt(fields.Length - 1);
            fields = lstStr.ToArray();

            string strValues = string.Join("','", fields);
            strValues = "'" + strValues + "'";
            if (fields.Length >= 39 && fields[0] == "03")
            {
                factory.ImportTaggedRecord(fields[38], count);
                int nIndex = strValues.IndexOf(fields[38]);
                strValues = strValues.Substring(0, nIndex - 2);
            }

            factory.ImportIPB(strValues, fields[0], count);

            progress.ProgressComplete = (count * 100) / nLines;
            progress.Message = "Importing Record: " + count++.ToString();
            Notify();
        }
    }
    #endregion

DataFactory.cs ImportIPB function

    #region public void ImportIPB(string strValues, string strType)
    /*************************************************************************************
    *************************************************************************************/
    public void ImportIPB(string strValues, string strType, int nPosition)
    {
        string strCommand = string.Empty;

        switch (strType)
        {
            case "01":
                strCommand = Queries.strIPBInsert;
                break;
            case "02":
                strCommand = Queries.strFigureInsert;
                break;
            case "03":
                strCommand = Queries.strPartInsert;
                break;
        }

        ExecuteNonQuery(strCommand + strValues + ", " + nPosition.ToString() + ")");
    }
    #endregion

Database.cs ExecuteNonQuery method

    #region public void ExecuteNonQuery(string strSQL)
    /*************************************************************************************
    *************************************************************************************/
    public void ExecuteNonQuery(string strSQL)
    {
        DbCommand dbCommand = _dbConnection.CreateCommand();
        dbCommand.CommandText = strSQL;
        dbCommand.Prepare();
        dbCommand.ExecuteNonQuery();
    }
    #endregion

Can anyone see anything from the provided that could be improved on? Are there settings to the background worker that can be set to work faster? Are there default settings for the background worker? What kind of settings are there on the db file that could be changed (using SQLite Expert Personal) to make inserts faster? Is it just the size of my file? Right now, as I finish this, it just went past 22 minutes and it's done 24,000 records. This isn't a time sensitive matter, so take all the time you need. Thanks.

Update: Also, I think I should mention that on one of the tables I have an integer primary key (acts as an identity field). Could that have any performance issues?


Solution

  • Use a single SQLiteTransaction around the entire insert. As is, it will force flushing to the file after every insert to preserve ACID-compliance. As with any DbConnection and DbTransaction, you use BeginTransaction, then when you're done, Commit. The entire insert will succeed or fail, and it will have better performance.