I use the following snippet for data processing:
var pipeline = _mlContext.Transforms.Conversion.ConvertType(new[] {
        new InputOutputColumnPair("x1", "x"),
        new InputOutputColumnPair("a1", "a"),
    },
    outputKind: DataKind.Single)
    .Append(_mlContext.Transforms.Categorical.OneHotEncoding(new[] {
        new InputOutputColumnPair("b1", "b"),
        new InputOutputColumnPair("c1", "c")
    }))
    .Append(_mlContext.Transforms.SelectColumns("x1", "a1", "b1", "c1", "Label"));
data = pipeline.Fit(data).Transform(data);
// Split the data into a training set and a test set
split = _mlContext.Data.TrainTestSplit(data, testFraction: 0.2);
// Define the target column name
labelColumnName = nameof(Dataset.Label);
string[] featureColumnNames = data.Schema.AsQueryable()
    .Select(column => column.Name) // Get all the column names
    .Where(name => name != nameof(Dataset.Label)) // Do not include the Label column
    .ToArray();
// Create the data process pipeline
var dataProcessPipeline = _mlContext.Transforms.Concatenate("Features", featureColumnNames)
.Append(_mlContext.Transforms.NormalizeMeanVariance(inputColumnName: "Features", outputColumnName: "FeaturesNormalizedByMeanVar"));
When I try to do a prediction like this:
public List<PredictionEngineOutput> Predict(string path, List<DataToPredict> dataPredict)
{
    var model = _mlContext.Model.Load(path, out var schema);
    // Create a prediction engine
    var engine = _mlContext.Model.CreatePredictionEngine<DataToPredict, PredictionEngineSchema>(model);
    List<PredictionEngineSchema> predictionList = new List<PredictionEngineSchema>();
    var predictOutput = new List<PredictionEngineOutput>();
    foreach (var data in dataPredict)
    {
        // Make the prediction
        var prediction = engine.Predict(data);
        predictionList.Add(prediction);
        var p = new PredictionEngineOutput
        {
            PredictedLabel = prediction.PredictedLabel,
            Probability = prediction.Probability,
            Score = prediction.Score,
            FeatureContributions = new List<float>()
        };
        predictOutput.Add(p);
        foreach (var contribution in prediction.FeatureContributions.DenseValues())
        {
            p.FeatureContributions.Add(contribution);
        }
    }
    return predictOutput;
}
An error comes up on this line: var prediction = engine.Predict(data);
System.InvalidOperationException: Operation is not valid due to the current state of the object.
When predicting, I also tried applying the same transformers to the List<DataToPredict> dataPredict after loading it as a data view.
I'm not sure why that error occurs, but if you want to predict on multiple rows (data instances), I would recommend using the Transform method instead. It would look something like this:
var dataToPredictDataView = mlContext.Data.LoadFromEnumerable(dataPredict);
var predictions = model.Transform(dataToPredictDataView);
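To read the scored rows back as objects, one option is CreateEnumerable. The following is only a minimal sketch: the BatchPrediction class and PredictBatch method are hypothetical names, and it assumes the output type (for example PredictionEngineSchema from the question) has a parameterless constructor and properties matching the model's output columns.

```csharp
using System.Collections.Generic;
using System.Linq;
using Microsoft.ML;

// Hypothetical helper, not part of ML.NET itself.
public static class BatchPrediction
{
    // Scores all rows in one pass with Transform, then materializes the results.
    public static List<TOutput> PredictBatch<TInput, TOutput>(
        MLContext mlContext, ITransformer model, IEnumerable<TInput> rows)
        where TInput : class
        where TOutput : class, new()
    {
        // Wrap the in-memory rows in an IDataView.
        IDataView input = mlContext.Data.LoadFromEnumerable(rows);

        // Apply the loaded model to every row.
        IDataView scored = model.Transform(input);

        // Convert the scored data view back into strongly typed objects.
        // reuseRowObject: false returns a fresh object for each row.
        return mlContext.Data
            .CreateEnumerable<TOutput>(scored, reuseRowObject: false)
            .ToList();
    }
}
```

With the types from the question, a call might look like BatchPrediction.PredictBatch<DataToPredict, PredictionEngineSchema>(mlContext, model, dataPredict), after which each result can be mapped to PredictionEngineOutput as in the original loop.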
For more information, check out this how-to guide on making predictions.