c# .net sql-server machine-learning ml.net

Training ML.NET -- System.OutOfMemoryException


First off, I am new to ML.NET (and to ML as a whole). I am trying to set up a model in Model Builder using a SQL Server table as my data source. I select one label and 18 features from the same table, which contains a little more than 3 million records. When I finish selecting my label and features and click the Train button, I get a prompt telling me that VS will download 1.1 GB of data from the SQL Server (hosted on the same machine), which I acknowledge. I then get feedback indicating that the download is in progress, which lasts for 30 to 60 seconds, and then the following error:

Error retrieving SQL data: "Exception of type 'System.OutOfMemoryException' was thrown."
   at Microsoft.ML.ModelBuilder.ToolWindows.ModelBuilderDataContext.<DownloadSqlFileAsync>b__88_0()
   at System.Threading.Tasks.Task`1.InnerInvoke()
   at System.Threading.Tasks.Task.Execute()
--- End of stack trace from previous location where exception was thrown ---
   at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess(Task task)
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
   at Microsoft.ML.ModelBuilder.ToolWindows.ModelBuilderDataContext.<DownloadSqlFileAsync>d__88.MoveNext()
--- End of stack trace from previous location where exception was thrown ---
   at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess(Task task)
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
   at Microsoft.ML.ModelBuilder.ToolWindows.ModelBuilderDataContext.<<OnDataChanged>b__77_1>d.MoveNext()
--- End of stack trace from previous location where exception was thrown ---
   at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess(Task task)
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
   at Microsoft.ML.ModelBuilder.ToolWindows.TrainTabDataContext.<BuildTrainModelParametersAsync>d__138.MoveNext()
--- End of stack trace from previous location where exception was thrown ---
   at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess(Task task)
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
   at Microsoft.ML.ModelBuilder.ToolWindows.TrainTabDataContext.<StartTrainingAsync>d__130.MoveNext()
--- End of stack trace from previous location where exception was thrown ---
   at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess(Task task)
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
   at System.Runtime.CompilerServices.TaskAwaiter`1.GetResult()
   at Microsoft.ML.ModelBuilder.ToolWindows.TrainTabControl.<<StartTraining_Click>b__5_0>d.MoveNext()

Some fun facts:

  • I've watched RAM usage on the machine while the training attempt is made, and it never gets above 65% of the total RAM available.

  • In the same VS solution, I have another app where I routinely read the entirety of the table in question into memory via EF.

  • I am using VS Community and SQL Express

  • I see RAM usage increase by maybe 3 GB or so before the error occurs. It smells very much like the process is running as 32-bit (which would make sense of all of this), but if there's a setting for this, I can't find it. I've checked the Build properties for my ML project and made sure the platform is set to 64-bit, but I'm not sure that setting is even what is used when you train the model.
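
For reference, the 32-bit suspicion can be checked for any code you control. The sketch below is only an illustration (the `BitnessCheck` class and its method names are made up for this example): it reports whether the current process is 32-bit or 64-bit and computes the theoretical address-space ceiling for a given pointer size. Note that it describes the process it runs in, so calling it from your own project tells you about your project's build settings, not about the process that hosts Model Builder. The ceiling is also only an upper bound; the amount a 32-bit Windows process can actually use is lower.

```csharp
using System;

public static class BitnessCheck
{
    // Theoretical upper bound of the virtual address space, in GiB,
    // for a process whose pointers are pointerSizeBytes wide.
    // A 4-byte pointer can address 2^32 bytes, which is 4 GiB.
    public static double AddressableGiB(int pointerSizeBytes)
    {
        return Math.Pow(2, pointerSizeBytes * 8) / (1024.0 * 1024.0 * 1024.0);
    }

    // Summarizes the bitness of the process this code is running in.
    public static string Describe()
    {
        string bitness = Environment.Is64BitProcess ? "64-bit" : "32-bit";
        return $"process={bitness}, pointerSize={IntPtr.Size} bytes, " +
               $"os64={Environment.Is64BitOperatingSystem}";
    }

    public static void Main()
    {
        Console.WriteLine(Describe());
        Console.WriteLine($"32-bit address space ceiling: {AddressableGiB(4)} GiB");
    }
}
```

A memory increase of roughly 3 GB followed by an OutOfMemoryException, with plenty of physical RAM still free, is consistent with a process hitting that 4 GiB ceiling rather than the machine running out of memory.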


Solution

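  • Model Builder is (necessarily) a 32-bit extension, so it cannot process as much data as I was trying to push to it, no matter how much RAM the machine has free. The 64-bit Build setting on my project has no effect on this, because the SQL download happens inside the extension (the Microsoft.ML.ModelBuilder.ToolWindows frames in the stack trace), not in my project's code. I've opened a bug / feature request to either move the data ingestion into 64-bit code or change the way the data is ingested: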

    https://github.com/dotnet/machinelearning-modelbuilder/issues/647
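
One possible workaround, which is not part of the answer above, is to skip the Model Builder UI for the data step and load the table through the ML.NET API from your own 64-bit project. The sketch below assumes the `Microsoft.ML` and `System.Data.SqlClient` NuGet packages and uses `DatabaseLoader`, which, as far as I know, reads rows through a data reader when the data is consumed instead of downloading the whole table first. The `TrainingRow` class, the `TrainingData` table, and the column names are hypothetical placeholders, cut down to two features instead of 18. Numeric columns are cast to REAL because the loader maps them to `float` properties.

```csharp
using System;
using System.Data.SqlClient;
using System.Linq;
using Microsoft.ML;
using Microsoft.ML.Data;

// Hypothetical row type: property names must match the column names
// returned by the SQL query.
public class TrainingRow
{
    public float Label { get; set; }
    public float Feature1 { get; set; }
    public float Feature2 { get; set; }
}

public static class SqlTraining
{
    // Builds a SELECT that casts every listed column to REAL so it
    // maps onto a float property of the row type.
    public static string BuildSelect(string table, params string[] columns)
    {
        string list = string.Join(", ",
            columns.Select(c => $"CAST([{c}] AS REAL) AS [{c}]"));
        return $"SELECT {list} FROM [{table}]";
    }

    // Returns a lazily evaluated view over the SQL table.
    public static IDataView Load(MLContext mlContext, string connectionString)
    {
        DatabaseLoader loader = mlContext.Data.CreateDatabaseLoader<TrainingRow>();
        string sql = BuildSelect("TrainingData", "Label", "Feature1", "Feature2");
        var source = new DatabaseSource(SqlClientFactory.Instance, connectionString, sql);
        return loader.Load(source);
    }
}
```

The returned `IDataView` can then be passed to a training pipeline in the same project. This only helps if that project really runs as a 64-bit process, so the platform setting from the question matters here even though it does not for Model Builder.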