Search code examples
c#wpfasynchronous

Read large Excel file with EPPlus using Async/Await C# 6.0


Currently, in a small but important project, I have to read a rather large Excel File. This is done with the help of the EPPlus library. Honestly, this is the first time that I'm trying to use Async/Await in a real-life scenario.

Since this application will be running on a limited number of PCs, the binaries will be shipped on the respective local PCs. The problem I am facing is that it takes quite a bit of time for the method to read the entire Excel file and the View is of course entering in a non-responding state ( obviously ).

Thus, I have tried to make the method to read the file asynchronously but it seems I have a problem with it the application shuts down just like someone would close it. Running the method with the debugger works - it reads the lines and adds them to the Data Table.

Code below:

 public async Task<DataTable> ReadFileContent(string filePath)
        {
            var rowStart = 7;
            var columnStart = 1;
            var existingFile = new FileInfo(filePath);
            using (var package = new ExcelPackage(existingFile))
            {
                var worksheet = package.Workbook.Worksheets["BOP"];
                var dt = new DataTable();

                // Compose the name of the table from:
                // - Product Type
                // - Customer
                // - Customer Project
                // * These can be found in the "CollaborationContext" sheet that is present in the ExcelFile.

                var composedName = String.Empty;
                var stringBuilder = new StringBuilder();
                if (package.Workbook.Worksheets["CollaborationContext"].SelectedRange["B6"].Value.Equals("object_type"))
                {
                    dt.TableName = package.Workbook.Worksheets["CollaborationContext"].SelectedRange["C9"].Value + " " +
                                   package.Workbook.Worksheets["CollaborationContext"].SelectedRange["D9"].Value + " " +
                                   package.Workbook.Worksheets["CollaborationContext"].SelectedRange["E9"].Value;
                }
              
                
                    dt.TableName = package.Workbook.Worksheets["CollaborationContext"].SelectedRange["B9"].Value + " " + 
                                   package.Workbook.Worksheets["CollaborationContext"].SelectedRange["C9"].Value + " " +
                                   package.Workbook.Worksheets["CollaborationContext"].SelectedRange["D9"].Value;


                // Get Columns
                await Task.Run(() =>
                {

                    for (var col = columnStart; col <= worksheet.Dimension.End.Column - 1; col++)
                        dt.Columns.Add(worksheet.Cells[6, col].Value.ToString());

                    // Place data into DataTable
                    for (int row = rowStart; row <= worksheet.Dimension.End.Row; row++)
                    {
                        var dr = dt.NewRow();
                        var x = 0;

                        for (var col = columnStart; col <= worksheet.Dimension.End.Column - 1; col++)
                        {
                            dr[x++] = worksheet.Cells[row, col].Value;
                        }
                        dt.Rows.Add(dr);
                    }
                });
              
                return dt;
            } 
            
        }

The code where the method is getting called via a ButtonClick Event is:

  private async  void BtnReadFileContent_OnClick(object sender, RoutedEventArgs e)
        {
  
            var dt =  await _sourceService.ReadFileContent(filePath);
            dataGrid.ItemsSource = dt.DefaultView;

            if (dt.Rows.Count > 0)
                BtnCreateXmlFile.IsEnabled = true;
        }

Something I am doing is wrong, the application hangs a bit and that is not the intention, I have tried to add Debug.WriteLine("test") immediately after var dt = await _sourceService.ReadFileContent(filePath); and it is getting hit only after the file has been read entirely, thus the behavior is just like it's running synchronously.

Any help would be gladly appreciated.

Thank you so much for your attention and participation.


Solution

  • Your ReadFileContent method isn't asynchronous by nature as it doesn't use any asynchronous API:s and using await Task.Run for asynchronous wrappers is generally considered to be a code smell: http://blog.stephencleary.com/2013/11/taskrun-etiquette-examples-dont-use.html

    Try to make the method synchronous:

    public DataTable ReadFileContent(string filePath)
    {
        var rowStart = 7;
        var columnStart = 1;
        var existingFile = new FileInfo(filePath);
        using (var package = new ExcelPackage(existingFile))
        {
            var worksheet = package.Workbook.Worksheets["BOP"];
            var dt = new DataTable();
    
            var composedName = String.Empty;
            var stringBuilder = new StringBuilder();
            if (package.Workbook.Worksheets["CollaborationContext"].SelectedRange["B6"].Value.Equals("object_type"))
            {
                dt.TableName = package.Workbook.Worksheets["CollaborationContext"].SelectedRange["C9"].Value + " " +
                               package.Workbook.Worksheets["CollaborationContext"].SelectedRange["D9"].Value + " " +
                               package.Workbook.Worksheets["CollaborationContext"].SelectedRange["E9"].Value;
            }
    
    
            dt.TableName = package.Workbook.Worksheets["CollaborationContext"].SelectedRange["B9"].Value + " " +
                           package.Workbook.Worksheets["CollaborationContext"].SelectedRange["C9"].Value + " " +
                           package.Workbook.Worksheets["CollaborationContext"].SelectedRange["D9"].Value;
    
    
            for (var col = columnStart; col <= worksheet.Dimension.End.Column - 1; col++)
                dt.Columns.Add(worksheet.Cells[6, col].Value.ToString());
    
            // Place data into DataTable
            for (int row = rowStart; row <= worksheet.Dimension.End.Row; row++)
            {
                var dr = dt.NewRow();
                var x = 0;
    
                for (var col = columnStart; col <= worksheet.Dimension.End.Column - 1; col++)
                {
                    dr[x++] = worksheet.Cells[row, col].Value;
                }
                dt.Rows.Add(dr);
            }
    
            return dt;
        }
    }
    

    ...and call it on a background thread:

    private void BtnReadFileContent_OnClick(object sender, RoutedEventArgs e)
    {
        Task.Run(() => _sourceService.ReadFileContent(filePath)) //<- this will run on a background thread
            .ContinueWith(task =>
            {
                //...and this will run back on the UI thread once the task has finished
                dataGrid.ItemsSource = task.Result.DefaultView;
                if (task.Result.Rows.Count > 0)
                    BtnCreateXmlFile.IsEnabled = true;
            }, System.Threading.CancellationToken.None, TaskContinuationOptions.None, TaskScheduler.FromCurrentSynchronizationContext());
    }