c# machine-learning prediction ml.net

ML.Net array data input


I am having a hard time predicting monthly sales from daily sales data using ML.NET.

    class Data
    {
        [Column(ordinal: "0", name: "Label")]
        public float PredictedProfit;
        [Column(ordinal: "Month")]
        public int Month;
        [Column(ordinal: "DayOfMonth")]
        public int DayOfMonth;
        [Column(ordinal: "Sales")]
        public double[] Sales;
        [Column(ordinal: "MonthlyProfit")]
        public double MonthlyProfit;
    }
    ...........................
    MLContext mlContext = new MLContext(seed: 0);
    List<VData> listData;
    VData row=new VData();
    .....
    fill row
    .....
    listData.Add(row);
    var trainData = mlContext.CreateStreamingDataView<VData>(listData);   

    var pipeline = mlContext.Transforms.CopyColumns("Label", "MonthlyProfit");            

    pipeline.Append(mlContext.Transforms.Concatenate("Features", "MonthlyProfit", "Sales", "Month", "DayOfMonth"));

    pipeline.Append(mlContext.Regression.Trainers.FastTree());

    var model = pipeline.Fit(trainData);

    var dataView = mlContext.CreateStreamingDataView<VData>(listData);
    var predictions = model.Transform(dataView);
    var metrics = mlContext.Regression.Evaluate(predictions, "Label", "MonthlyProfit");

The metrics values are always zero, and no predictions are produced.


Solution

  • Pipelines in ML.NET are immutable: calls to pipeline.Append return a new updated pipeline, but don't change the original pipeline.

    Modify your code to assign the result back to the variable:

    var pipeline = mlContext.Transforms.CopyColumns("Label", "MonthlyProfit");            
    
    pipeline = pipeline.Append(mlContext.Transforms.Concatenate("Features", "MonthlyProfit", "Sales", "Month", "DayOfMonth"));
    
    pipeline = pipeline.Append(mlContext.Regression.Trainers.FastTree());
    

    In addition, the [Column] attribute you are using has no effect. To change the label column's name, use [ColumnName("Label")]. None of the other attributes are needed.
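
Putting both points together, a minimal sketch might look like the following. It assumes ML.NET 1.x with the Microsoft.ML and Microsoft.ML.FastTree NuGet packages, where mlContext.Data.LoadFromEnumerable takes the place of CreateStreamingDataView. The SalesData class name, the fixed vector length of 31, and the synthetic rows are made up for illustration. The fields are declared as float because Concatenate expects its inputs to share one type and FastTree works on float features. MonthlyProfit is left out of Features since it is the value being predicted. Note also that the third argument of Regression.Evaluate names the score column, so the sketch passes "Score" rather than "MonthlyProfit".

```csharp
// Sketch only: assumes ML.NET 1.x (Microsoft.ML + Microsoft.ML.FastTree packages).
using System;
using System.Collections.Generic;
using Microsoft.ML;
using Microsoft.ML.Data;

// Hypothetical input class; names other than the column names are illustrative.
class SalesData
{
    // Assumed fixed length: one slot per day, padded to 31 entries.
    [VectorType(31)]
    public float[] Sales;

    public float Month;
    public float DayOfMonth;

    // Exposed to ML.NET as the "Label" column, so no CopyColumns step is needed.
    [ColumnName("Label")]
    public float MonthlyProfit;
}

static class Program
{
    static void Main()
    {
        var mlContext = new MLContext(seed: 0);

        // Synthetic placeholder rows; replace with real data.
        var rng = new Random(0);
        var rows = new List<SalesData>();
        for (int i = 0; i < 200; i++)
        {
            var sales = new float[31];
            float total = 0f;
            for (int d = 0; d < sales.Length; d++)
            {
                sales[d] = (float)(rng.NextDouble() * 100);
                total += sales[d];
            }
            rows.Add(new SalesData
            {
                Sales = sales,
                Month = 1 + i % 12,
                DayOfMonth = 31,
                MonthlyProfit = total * 0.2f // made-up relationship
            });
        }

        IDataView trainData = mlContext.Data.LoadFromEnumerable(rows);

        // Append returns a new estimator chain, so chain the calls
        // (or reassign the variable) instead of discarding the result.
        var pipeline = mlContext.Transforms
            .Concatenate("Features", "Sales", "Month", "DayOfMonth")
            .Append(mlContext.Regression.Trainers.FastTree());

        var model = pipeline.Fit(trainData);

        // Evaluated on the training rows here only to mirror the question.
        var predictions = model.Transform(trainData);
        var metrics = mlContext.Regression.Evaluate(
            predictions, labelColumnName: "Label", scoreColumnName: "Score");

        Console.WriteLine($"RMSE: {metrics.RootMeanSquaredError}");
        Console.WriteLine($"R^2:  {metrics.RSquared}");
    }
}
```

Evaluating on the same rows used for training, as the question does, tends to overstate accuracy; holding out part of the data gives a more honest figure.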