Search code examples
c#asp.net.netiisopenxml-sdk

C# - Browser is sending duplicate request and error connection was reset for long running process


I have a page with a button. On click I am fetching 539,200 rows from a table and creating excel file with OpenXML SDK. It creates more than 50MB file which I upload on an FTP through WebClient.UploadFile method. In this whole process the IIS Worker process CPU consumption goes till 30% and Memory utilization goes till 1.2 GB. Whole process takes around 10 minutes to complete on my server. After completion it takes additional 10 minutes for the browser (Firefox) and shows error "The connection was reset". After 10 minutes before previous execution completion I saw another execution started in the log. I am sure no one else is using that server.

My Questions

  1. Why the Memory utilization didn't come down after process is completed? I have carefully disposed each & every object. Even called Garbage Collector. I had to restart IIS to release memory. What can I do in code?

  2. As per my logs the total execution finished in 10 minutes but still browser didn't respond, It kept showing "Connecting.....". After another ~10 minutes it gave error "The connection was reset". What went wrong?

  3. Why I saw another execution log before previous ended? Is browser sending another request or IIS/ASP.Net gone mad?

  4. When used Ajax request for execution I saw repeated log entries after every ~10 minutes until I restarted IIS. What happened?

Repeated log entries translates to repeated execution of same process. I am going mad now.

I am using Windows Server 2012 with IIS 8.

Edit

To Get Data

    Function fun = new Function();
            List<SqlParameter> para = new List<SqlParameter>();
            para.Add(new SqlParameter() { ParameterName = "@IDs", SqlDbType = SqlDbType.NVarChar, Size = 4000, Value = "something" });
            para.Add(new SqlParameter() { ParameterName = "@Fromdate", SqlDbType = SqlDbType.Date, Value = "2017-06-01" });
            para.Add(new SqlParameter() { ParameterName = "@Todate", SqlDbType = SqlDbType.Date, Value = "2017-06-27" });
            dsExcel = fun.GetDataSet("sp_GetData", para);

Export to Excel (with Pivot Table)

    private bool ExportDSToExcel(string destination)
    {
        LogUtil.LogInfo("Writing excel with rows: " + dsExcel.Tables[0].Rows.Count);
        try
        {
            using (var spreadsheet = SpreadsheetDocument.Open(destination, true))
            {
                foreach (DataTable table in dsExcel.Tables)
                {
                    WorkbookPart workbookPart = spreadsheet.WorkbookPart;
                    WorksheetPart worksheetPart = workbookPart.WorksheetParts.First();
                    string origninalSheetId = workbookPart.GetIdOfPart(worksheetPart);
                    WorksheetPart replacementPart =
                    workbookPart.AddNewPart<WorksheetPart>();
                    string replacementPartId = workbookPart.GetIdOfPart(replacementPart);
                    DocumentFormat.OpenXml.OpenXmlReader reader = DocumentFormat.OpenXml.OpenXmlReader.Create(worksheetPart);
                    DocumentFormat.OpenXml.OpenXmlWriter writer = DocumentFormat.OpenXml.OpenXmlWriter.Create(replacementPart);
                    while (reader.Read())
                    {
                        if (reader.ElementType == typeof(SheetData))
                        {
                            if (reader.IsEndElement)
                                continue;
                            writer.WriteStartElement(new SheetData());
                            DocumentFormat.OpenXml.Spreadsheet.Row headerRow = new DocumentFormat.OpenXml.Spreadsheet.Row();
                            writer.WriteStartElement(headerRow);

                            List<String> columns = new List<string>();
                            foreach (DataColumn column in table.Columns)
                            {
                                columns.Add(column.ColumnName);
                                Cell cell = new Cell();
                                cell.DataType = CellValues.String;
                                cell.CellValue = new CellValue(column.ColumnName);
                                writer.WriteElement(cell);
                            }
                            //End Row element writing
                            writer.WriteEndElement();

                            foreach (DataRow dsrow in table.Rows)
                            {
                                DocumentFormat.OpenXml.Spreadsheet.Row newRow = new DocumentFormat.OpenXml.Spreadsheet.Row();
                                writer.WriteStartElement(newRow);
                                foreach (String col in columns)
                                {
                                    Cell cell = new Cell();
                                    if ((dsrow[col].GetType().ToString().Contains(TypeCode.Int32.ToString()) || (dsrow[col].GetType().ToString().Contains(TypeCode.Decimal.ToString())) || (dsrow[col].GetType().ToString().Contains(TypeCode.Int64.ToString()))))
                                    {
                                        cell.DataType = CellValues.Number;
                                    }
                                    else
                                    {
                                        cell.DataType = CellValues.String;
                                    }
                                    cell.CellValue = new CellValue(dsrow[col].ToString()); //
                                    writer.WriteElement(cell);
                                }
                                writer.WriteEndElement();
                            }
                            //End SheetData writing
                            writer.WriteEndElement();
                        }
                        else
                        {
                            if (reader.IsStartElement)
                            {
                                writer.WriteStartElement(reader);
                            }
                            else if (reader.IsEndElement)
                            {
                                writer.WriteEndElement();
                            }
                        }
                    }
                    reader.Close();
                    reader.Dispose();
                    writer.Close();
                    writer.Dispose();
                    Sheet sheet = workbookPart.Workbook.Descendants<Sheet>()
                    .Where(s => s.Id.Value.Equals(origninalSheetId)).First();
                    sheet.Id.Value = replacementPartId;
                    workbookPart.DeletePart(worksheetPart);
                }
                PivotTableCacheDefinitionPart ptp = spreadsheet.WorkbookPart.PivotTableCacheDefinitionParts.First();
                ptp.PivotCacheDefinition.RefreshOnLoad = true;
                ptp.PivotCacheDefinition.RecordCount = Convert.ToUInt32(dsExcel.Tables[0].Rows.Count);                    ptp.PivotCacheDefinition.CacheSource.WorksheetSource.Reference = "A1:" + IntToLetters(dsExcel.Tables[0].Columns.Count) + (dsExcel.Tables[0].Rows.Count + 1);                    ptp.PivotTableCacheRecordsPart.PivotCacheRecords.RemoveAllChildren();
                ptp.PivotTableCacheRecordsPart.PivotCacheRecords.Count = 0;
                spreadsheet.Save();
                spreadsheet.Close();
                spreadsheet.Dispose();
                //GC.Collect();
                //GC.WaitForPendingFinalizers();
            }
            LogUtil.LogInfo("Wrote excel");
            return true;
        }
        catch (Exception ex)
        {
            return false;
        }
    }

Upload to FTP

    public void UploadFileToFtp(string file)
    {
        FileInfo fileInfo = new FileInfo(file);
        using (WebClient client = new WebClient())
        {
            client.Credentials = ftpNetworkCredentials;
            client.UploadFile(ftpUri + fileInfo.Name, "STOR", file);
            client.Dispose();
        }
        LogUtil.LogInfo(file + " uploaded successfully");
    }

Button Click event code

    protected void btnSubmit_Click(object sender, EventArgs e)
    {
        LogUtil.LogInfo("Getting data");
        FillReportTable();
        LogUtil.LogInfo("File upload is disabled");
        string IOPath = Server.MapPath("~/Report-" + DateTime.Now.ToString("MM-dd-yyyy-hh-mm-ss") + ".xlsx");
        if (System.IO.File.Exists(IOPath))
        {
            System.IO.File.Delete(IOPath);
        }
        System.IO.File.Copy(Server.MapPath("~/TempReport.xlsx"), IOPath);
        ExportDSToExcel(IOPath);
        if (Convert.ToBoolean(ConfigurationManager.AppSettings["ftpUpload"].ToString()))
        {
            UploadToFTP(IOPath);
        }
        else
        {
            LogUtil.LogInfo("File upload is disabled");
        }            
        lblMessage.Text = "File uploaded successfully";
    }

Seems like the memory utilization is not coming down because the server is repeating the execution before finishing previous request either because browser keeps sending request after every 10 min. I checked with fiddler but it floods the log with polling calls. Could not check thoroughly. The same code takes same time and works fine when the page is accessed on same server. Creates problem when did same thing over internet. I have 10mbps connection at work and VM on Azure cloud.


Solution

  • There are many reasons you could be experiencing this behavior but in generally speaking you will need to jump through hoops to get get a browser to wait 10 minutes for response. Instead, generally speaking, you should return a response to the browser before then and poll your application periodically to see if the task has completed. Of course there are other methods, such as using websockets, to initiate the process and wait for a response.

    1. If your application still has a reference to the object it will stay in memory, even if you garbage collected. Its hard to say where this reference could be without seeing your code.

    2. As explained a browser will stop waiting for a response and close the underlying connection long before 10 minutes elapses.

    3. It is entirely possible that the browser, after closing the non-responsive connection tried the request automatically.

    4. Hard to tell without seeing your code.

    An easy way to run a long running task is trigger it using Ajax, as you stated you have done and run the task using a System.Threading.Tasks.Task. You can store a reference to the task however you would like. Ajax can then be used to poll the status of the task to check if has completed or not.

    Hears a boilerplate implementation that uses a Generic Handler on the server side to run a task that takes 1 minute to complete and an HTML page that uses JQuery to start the task using Ajax and monitor the progress.

    LongRunningTask.ashx

    <%@ WebHandler Language="C#" Class="LongRunningTask" %>
    
    using System;
    using System.Web;
    using System.Web.SessionState;
    using System.Web.Script.Serialization;
    using System.Threading.Tasks;
    public class LongRunningTask : IHttpHandler, IRequiresSessionState
    {
        private const string INVALID = "Invalid value for m";
        private const string SESSIONKEY = "LongRunningTask";
        private const string STARTED = "Task Started";
        private const string RUNNING = "Task Running";
        private const string COMPLETED = "Task Completed";
    
        public void ProcessRequest(HttpContext context)
        {
            HttpRequest request = context.Request;
            string m = request.QueryString["m"];
            switch (m)
            {
                case "start":
                    TaskRunner runner = new TaskRunner();
                    context.Session[SESSIONKEY] = runner.DoWork();
                    ShowResponse(context, STARTED);
                    break;
                case "progress":
                    Task<int> t = (Task<int>)context.Session[SESSIONKEY];
                    ShowResponse(context, t.IsCompleted ? COMPLETED : RUNNING);
                    return;
                default:
                    ShowResponse(context, INVALID);
                    break;
            }
        }
    
        private void ShowResponse(HttpContext context, string message)
        {
            JavaScriptSerializer ser = new JavaScriptSerializer();
            string json = ser.Serialize(message);
            context.Response.ContentType = "text/javascript";
            context.Response.Write(json);
        }
    
    
        public bool IsReusable
        {
            get
            {
                return false;
            }
        }
        private class TaskRunner
        {
            public bool Finished { get; set; }
            private Task<int> longTask;
            public TaskRunner()
            {
    
            }
            public Task<int> DoWork()
            {
                var tcs = new TaskCompletionSource<int>();
                Task.Run(async () =>
                {
                    // instead of the following line, launch you method here.
                    await Task.Delay(1000 * 60 * 1);
                    tcs.SetResult(1);
                });
                longTask = tcs.Task;
                return longTask;
            }
        }
    
    }
    

    RunLongTask.html

    <!DOCTYPE html>
    <html>
    <head>
        <meta charset="utf-8" />
        <title>Run Long Task</title>
        <script src="//code.jquery.com/jquery-2.2.4.min.js" type="text/javascript"></script>
        <script type="text/javascript">
            $(document).ready(function () {
                $('#runLongTask').click(function () { runLongTask(); })
            });
            function runLongTask() {
                $.ajax
                    ({
                        type: "GET",
                        url: "LongRunningTask.ashx?m=start",
                        dataType: 'json',
                        success: function (data) {
                            $('#status').html(data);
                            window.setTimeout(checkStatus, 1000);
                        }
                    });
            }
            function checkStatus() {
                $.ajax
                    ({
                        type: "GET",
                        url: "LongRunningTask.ashx?m=progress",
                        dataType: 'json',
                        success: function (data) {
                            $('#status').html(new Date() + ' ' + data);
                            if (data !== "Task Completed") {
                                window.setTimeout(checkStatus, 1000);
                            }
                        }
                    });
            }
        </script>
    </head>
    <body>
        <div>
            <input id="runLongTask" type="button" value="Run Long Task" title="Run Long Task" />
        </div>
        <div id="status"></div>
    </body>
    </html>
    

    Edit

    Seeing your newly added code, you can integrate the same approach.

    Add the Generic Handler to your project. You can remove the task runner and the switch for "start".

    Modify btnSubmit_Click to run your code inside of a task:

    protected void btnSubmit_Click(object sender, EventArgs e)
    {
            //prevent running of duplicate tasks.
            if(context.Session[SESSIONKEY]!=null && ((Task<int>)context.Session[SESSIONKEY]).IsCompleted==false) return;
            var tcs = new TaskCompletionSource<int>();
            Task.Run(async () =>
            {
    
    
                LogUtil.LogInfo("Getting data");
                FillReportTable();
                LogUtil.LogInfo("File upload is disabled");
                string IOPath = Server.MapPath("~/Report-" + DateTime.Now.ToString("MM-dd-yyyy-hh-mm-ss") + ".xlsx");
                if (System.IO.File.Exists(IOPath))
                {
                    System.IO.File.Delete(IOPath);
                }
                System.IO.File.Copy(Server.MapPath("~/TempReport.xlsx"), IOPath);
                ExportDSToExcel(IOPath);
                if (Convert.ToBoolean(ConfigurationManager.AppSettings["ftpUpload"].ToString()))
                {
                    UploadToFTP(IOPath);
                }
                else
                {
                    LogUtil.LogInfo("File upload is disabled");
                }
                tcs.SetResult(1);
            });
            context.Session[SESSIONKEY] = tcs.Task;
            lblMessage.Text = "File uploaded started";
    }
    

    Then in the HTML of your web form add the method to monitor the progress using ajax:

    <script>
        $(document).ready(function() {
            $('#btnSubmit").click(function() {checkStatus();});
        }
        function checkStatus() {
            $.ajax
                ({
                    type: "GET",
                    url: "LongRunningTask.ashx?m=progress",
                    dataType: 'json',
                    success: function (data) {
                        $('#lblMessage').html(new Date() + ' ' + data);
                        if (data !== "Task Completed") {
                            window.setTimeout(checkStatus, 1000);
                        }
                    }
                });
        }
    </script>
    

    Edit 2

    Your newly added code also explains why the data remains in memory. The data you are using is stored as a local variable. A reference to the page, and its variable, remains in the application's memory while the code in the submit button click method is executing. Even though the browser is disconnecting the connection, the code is continuing to execute on the server side. It will not be released from memory until it completes execute and the page life cycle completes. Garbage collection will not remove it because it is still being referenced.

    The entire reason you are using so much memory in the first place is you are exporting the data to excel. In my experience I have seen excel an amount of memory several times that amount of the original data set. In fact, with data sets as large as yours you a getting close to an out of memory exception. If you can use another option such as CSV your code will run orders of a magnitude faster, as in seconds instead of minutes.

    That said, once your page goes out of scope the memory it will no longer be in use and it the memory will be freed upon garbage collection. You ca force this to happen faster by using an entirely different class for all your operations including data retrieval, transformation and upload. If you instantiate this task in a separate thread, once the task completes the class and all of its variables will go out of scope and the memory will be released.