Search code examples
c#exceloledb

import huge excel sheets to Datatable


I have a Excel Sheet with two tabs.

One is approx 700 k and other is around 25k. The issue is when i am loading the files my memory gets eaten up and it crashes! How to handle huge files as some might even be more than a million rows.

Here is my current code i am using:

  OleDbConnection cnn = new OleDbConnection("provider=Microsoft.ACE.OLEDB.12.0;Data Source='" + fileName + "';Extended Properties=Excel 12.0;");
                cnn.Open();

                string qry = "SELECT * FROM [Detail$]";
                OleDbDataAdapter odp = new OleDbDataAdapter(qry, cnn);
                odp.Fill(detailTable);
                DataSet tmp = new DataSet();
                if (detailTable.Rows.Count > 0)
                {
                    Console.WriteLine("Total " + detailTable.Rows.Count + " Detail rows Loaded");
                    // MessageBox.Show("Input Sheet UPLOADED !");

                }
                qry = "SELECT * FROM [Gallery$]";
                OleDbDataAdapter odp1 = new OleDbDataAdapter(qry, cnn);
                odp1.Fill(galleryTable);
                if (galleryTable.Rows.Count > 0)
                {
                    Console.WriteLine("Total " + galleryTable.Rows.Count + " Gallery Numbers Loaded");
                    //  MessageBox.Show("Input Sheet UPLOADED !");

                }

Solution

  • Okay, what I can suggest you is to use the DbDataAdapter.Fill(Int32, Int32, DataTable[]) overload method of the DbDataAdapter class to work in a "chunk" mode:

    public int Fill(
        int startRecord,
        int maxRecords,
        params DataTable[] dataTables
    )
    

    Using this method and my code example, you can do your job with chunks of rows at a time instead of working with the full excel data in-memory. After each fill, dispose your temp data-table object and you'll be able to avoid of memory leaks this way.

    Here is how you can do it:

            const string fileName = "myData.xlsx";
            const string excelConnString = "provider=Microsoft.ACE.OLEDB.12.0;Data Source='" + fileName + "';Extended Properties=Excel 12.0;";
    
            using (var cnn = new OleDbConnection(excelConnString))
            {
                cnn.Open();
    
                const string countQuery = "SELECT COUNT(*) FROM [Detail$]";
                using (var cmd = new OleDbCommand(countQuery, cnn))
                {
                    using (var reader = cmd.ExecuteReader())
                    {
                        if (reader == null) return;
    
                        reader.Read();
                        var rowsCount = ((int)reader[0]);
    
                        const string query = "SELECT * FROM [Detail$]";
                        using (var odp = new OleDbDataAdapter(query, cnn))
                        {
                            var detailTable = new DataTable();
                            var recordToStartFetchFrom = 0; //zero-based record number to start with.
                            const int chunkSize = 100;
                            while (recordToStartFetchFrom <= rowsCount)
                            {
                                var diff = rowsCount - recordToStartFetchFrom;
                                int internalChunkSize = diff < 100 ? diff : chunkSize;
                                odp.Fill(recordToStartFetchFrom, internalChunkSize, detailTable);
    
                                foreach (DataRow row in detailTable.Rows)
                                {
                                    Console.WriteLine("{1} {0}", row.ItemArray[0], row.ItemArray[1]);
                                }
    
                                Console.WriteLine("--------- {0}-{1} Rows Processed ---------", recordToStartFetchFrom, recordToStartFetchFrom + internalChunkSize);
    
                                recordToStartFetchFrom += chunkSize;
    
                                detailTable.Dispose();
                                detailTable = null;
                                detailTable = new DataTable();
                            }
                        }
                        Console.ReadLine();
                    }
                }
            }