Search code examples
c#out-of-memorysqlcommandvarbinarymax

Reading 200MB file from db throws Out Of Memory Exception


I am attempting to query the database and pull excel files that could be as large as 1 million rows (~200MB) stored as varbinary and pass it through a validator.

Our build server has 6GB of memory and a load-balanced processor and during runtime comes nowhere near maxing out the CPU or Memory.

Yet, after about 40 seconds the process throws an OutOfMemoryException.

Here's the stack trace:

System.OutOfMemoryException: Exception of type 'System.OutOfMemoryException' was thrown.
   at System.Data.SqlTypes.SqlBinary.get_Value()
   at System.Data.SqlClient.SqlBuffer.get_ByteArray()
   at System.Data.SqlClient.SqlBuffer.get_Value()
   at System.Data.SqlClient.SqlDataReader.GetValueFromSqlBufferInternal(SqlBuffer data, _SqlMetaData metaData)
   at System.Data.SqlClient.SqlDataReader.GetValueInternal(Int32 i)
   at System.Data.SqlClient.SqlDataReader.GetValue(Int32 i)
   at System.Data.SqlClient.SqlCommand.CompleteExecuteScalar(SqlDataReader ds, Boolean returnSqlValue)
   at System.Data.SqlClient.SqlCommand.ExecuteScalar()
   at eConfirmations.DataService.FileServices.FileDataService.GetFileContent(Guid fileId) in d:\w1\3\s\Source\eConfirmations.DataService\FileServices\FileDataService.cs:line 157
...
   at System.Data.SqlTypes.SqlBinary.get_Value()
   at System.Data.SqlClient.SqlBuffer.get_ByteArray()
   at System.Data.SqlClient.SqlBuffer.get_Value()
   at System.Data.SqlClient.SqlDataReader.GetValueFromSqlBufferInternal(SqlBuffer data, _SqlMetaData metaData)
   at System.Data.SqlClient.SqlDataReader.GetValueInternal(Int32 i)
   at System.Data.SqlClient.SqlDataReader.GetValue(Int32 i)
   at System.Data.SqlClient.SqlCommand.CompleteExecuteScalar(SqlDataReader ds, Boolean returnSqlValue)
   at System.Data.SqlClient.SqlCommand.ExecuteScalar()
   at eConfirmations.DataService.FileServices.FileDataService.GetFileContent(Guid fileId) in d:\w1\3\s\Source\eConfirmations.DataService\FileServices\FileDataService.cs:line 157

And here's the my code that throws the exception:

    private byte[] GetFileContent(Guid fileId)
    {
        byte[] content;
        string connectionString = ConfigurationManager.ConnectionStrings["eConfirmationsDatabase"].ConnectionString;

        using (SqlConnection sqlConnection = new SqlConnection(connectionString))
        {
            using (SqlCommand sqlCommand = sqlConnection.CreateCommand())
            {
                sqlCommand.CommandTimeout = 300;
                sqlCommand.CommandText = $"SELECT Content FROM dbo.[File] WHERE FileId = '{fileId}'";
                sqlConnection.Open();
                content = sqlCommand.ExecuteScalar() as byte[];
                sqlConnection.Close();
                sqlCommand.Dispose();
            }
            sqlConnection.Dispose();
        }
        return content;
    }

Is there a more efficient way to pull back this data or can we update a setting on our build server to avoid this error?


Solution

  • Ok so here's what's happening:

    Because this is running on a 32-bit build, the maximum memory allocation is 2GB but I'm still coming nowhere near that threshold.

    According to this stackoverflow post that is very similar to my situation, the .NET framework restricts objects to a limit of 256MB in memory.

    So even though my file is only 200MB, byte[]s and MemoryStreams expand by powers of 2 until they reach the 256MB necessary. When they expand, they create a new instance of the appropriate size and copy the old data over to the new one, effectively multiplying the memory usage by 3 which causes the exception.

    MSDN has an example of how to retrieve a large file using a FileStream, but instead of a FileStream, I use a static byte[] pre-initialized to the size of my data using this post.

    Here is my final solution:

        public File GetFileViaFileIdGuid(Guid fileId)
        {
            File file = new File();
            string connectionString = ConfigurationManager.ConnectionStrings["Database"].ConnectionString;
            using (var sourceSqlConnection = new SqlConnection(connectionString))
            {
                using (SqlCommand sqlCommand = sourceSqlConnection.CreateCommand())
                {
                    sqlCommand.CommandText = $"SELECT FileName, FileExtension, UploadedDateTime, DATALENGTH(Content) as [ContentLength] FROM dbo.[File] WHERE FileId = '{fileId}'";
                    sqlCommand.CommandType = CommandType.Text;
                    sqlCommand.CommandTimeout = 300;
                    sourceSqlConnection.Open();
    
                    var reader = sqlCommand.ExecuteReader();
                    while (reader.Read())
                    {
                        file.FileId = fileId;
                        file.FileExtension = reader["FileExtension"].ToString();
                        file.FileName = reader["FileName"].ToString();
                        file.UploadedDateTime = (DateTime)reader["UploadedDateTime"];
                        file.Content = new byte[Convert.ToInt32(reader["ContentLength"])];
                    }
    
                    reader.Close();
                    sourceSqlConnection.Close();
                }
            }
            file.Content = GetFileContent(file.FileId, file.Content.Length);
            return file;
        }
    

    And to fetch the Content:

        private byte[] GetFileContent(Guid fileId, int contentLength)
        {
            int outputSize = 1048576;
            int bufferSize = contentLength + outputSize;
            byte[] content = new byte[bufferSize];
            string connectionString = ConfigurationManager.ConnectionStrings["Database"].ConnectionString;
    
            using (SqlConnection sqlConnection = new SqlConnection(connectionString))
            {
                using (SqlCommand sqlCommand = sqlConnection.CreateCommand())
                {
                    sqlCommand.CommandTimeout = 300;
                    sqlCommand.CommandText = $"SELECT Content FROM dbo.[File] WHERE FileId = '{fileId}'";
                    sqlConnection.Open();
                    using (SqlDataReader reader = sqlCommand.ExecuteReader(CommandBehavior.SequentialAccess))
                    {
    
                        while (reader.Read())
                        {
                            int startIndex = 0;
                            long returnValue = reader.GetBytes(0, startIndex, content, startIndex, outputSize);
                            while (returnValue == outputSize)
                            {
                                startIndex += outputSize;
                                returnValue = reader.GetBytes(0, startIndex, content, startIndex, outputSize);
                            }
                        }
                    }
    
                    sqlConnection.Close();
                }
            }
            return content;
        }