machine-learning regression azure-machine-learning-service

What is wrong with my experiment (Trying to predict car sales)?


I have a dataset like this (just a sample of it):

DATE_REF,MONTH,YEAR,DAY_OF_YEAR,DAY_OF_MONTH,WEEK_DAY,WEEK_DAY_1,WEEK_DAY_2,WEEK_DAY_3,WEEK_DAY_4,WEEK_DAY_5,WEEK_DAY_6,WEEK_DAY_7,WEEK_NUMBER_IN_MONTH,WEEKEND,WORK_DAY,AMOUNT_SOLD
20100101,1,2010,1,1,6,0,0,0,0,0,1,0,1,0,0,0
20100102,1,2010,2,2,7,0,0,0,0,0,0,1,1,1,0,2
20100103,1,2010,3,3,1,1,0,0,0,0,0,0,2,1,0,0
20100104,1,2010,4,4,2,0,1,0,0,0,0,0,2,0,1,12830
20100105,1,2010,5,5,3,0,0,1,0,0,0,0,2,0,1,19200
20100106,1,2010,6,6,4,0,0,0,1,0,0,0,2,0,1,22930
20100107,1,2010,7,7,5,0,0,0,0,1,0,0,2,0,1,23495
20100108,1,2010,8,8,6,0,0,0,0,0,1,0,2,0,1,23215
20100109,1,2010,9,9,7,0,0,0,0,0,0,1,2,1,0,172
20100110,1,2010,10,10,1,1,0,0,0,0,0,0,3,1,0,0
20100111,1,2010,11,11,2,0,1,0,0,0,0,0,3,0,1,18815
20100112,1,2010,12,12,3,0,0,1,0,0,0,0,3,0,1,25415
20100113,1,2010,13,13,4,0,0,0,1,0,0,0,3,0,1,25262
20100114,1,2010,14,14,5,0,0,0,0,1,0,0,3,0,1,27967
20100115,1,2010,15,15,6,0,0,0,0,0,1,0,3,0,1,26352
20100116,1,2010,16,16,7,0,0,0,0,0,0,1,3,1,0,202
20100117,1,2010,17,17,1,1,0,0,0,0,0,0,4,1,0,10
20100118,1,2010,18,18,2,0,1,0,0,0,0,0,4,0,1,20295
20100119,1,2010,19,19,3,0,0,1,0,0,0,0,4,0,1,25982
20100120,1,2010,20,20,4,0,0,0,1,0,0,0,4,0,1,24745
20100121,1,2010,21,21,5,0,0,0,0,1,0,0,4,0,1,28087
20100122,1,2010,22,22,6,0,0,0,0,0,1,0,4,0,1,28417
20100123,1,2010,23,23,7,0,0,0,0,0,0,1,4,1,0,115
20100124,1,2010,24,24,1,1,0,0,0,0,0,0,5,1,0,5
20100125,1,2010,25,25,2,0,1,0,0,0,0,0,5,0,1,20185
20100126,1,2010,26,26,3,0,0,1,0,0,0,0,5,0,1,25932
20100127,1,2010,27,27,4,0,0,0,1,0,0,0,5,0,1,31710
20100128,1,2010,28,28,5,0,0,0,0,1,0,0,5,0,1,21020
20100129,1,2010,29,29,6,0,0,0,0,0,1,0,5,0,1,51460
20100130,1,2010,30,30,7,0,0,0,0,0,0,1,5,1,0,670
20100131,1,2010,31,31,1,1,0,0,0,0,0,0,6,1,0,17

I'm trying to predict the AMOUNT_SOLD for new dates (DATE_REF) using the following experiment on Azure ML:

[Image: Azure ML Experiment]

Then I deployed the Web Service and tested the prediction, but all I got was zero for the AMOUNT_SOLD column.

What may I be missing?


Solution

  • I would have liked to replicate your Azure ML experiment exactly, but I do not have enough data. Here is what I did instead:


    I copied your sample data and quadrupled it (Add Rows, applied twice). Then I used Split Data (70%/30%) with random seed 7 (for reproducible results). The Boosted Decision Tree Regression module uses its default parameters. In Tune Model Hyperparameters, I selected AMOUNT_SOLD as the label column. Score Model and Evaluate Model come after that.
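
    The duplicate-and-split step can be sketched outside Azure ML. This is a minimal Python sketch, not the Add Rows or Split Data modules themselves: it assumes a plain random shuffle with seed 7 (which will not reproduce the studio's exact partition) and uses a few (DATE_REF, AMOUNT_SOLD) pairs from the sample. It also counts how many test rows have an identical copy in the training set, which is worth keeping in mind when reading the evaluation score on duplicated data.

```python
import random

# A few (DATE_REF, AMOUNT_SOLD) pairs copied from the sample in the question.
rows = [
    (20100104, 12830), (20100105, 19200), (20100106, 22930),
    (20100107, 23495), (20100108, 23215), (20100109, 172),
    (20100110, 0), (20100111, 18815), (20100112, 25415), (20100113, 25262),
]

def duplicate_and_split(rows, copies=4, train_fraction=0.7, seed=7):
    """Repeat the rows `copies` times, shuffle with a fixed seed, then split."""
    data = list(rows) * copies
    random.Random(seed).shuffle(data)
    cut = int(len(data) * train_fraction)
    return data[:cut], data[cut:]

train, test = duplicate_and_split(rows)
train_set = set(train)
# Test rows whose exact copy also appears in the training partition.
leaked = sum(1 for row in test if row in train_set)
print(f"train={len(train)} test={len(test)} test rows also in train={leaked}")
```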


    The accuracy (Coefficient of Determination) reported by Evaluate Model was pretty good.
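
    For reference, the Coefficient of Determination (R²) compares the model's squared error with the error of always predicting the mean. Below is a minimal sketch of the standard formula; the function name and the predicted values are illustrative and are not taken from the experiment's output.

```python
def coefficient_of_determination(actual, predicted):
    """R^2 = 1 - SS_res / SS_tot; 1.0 means a perfect fit."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot

actual = [12830, 19200, 22930, 23495]     # AMOUNT_SOLD values from the sample
predicted = [13000, 19000, 23000, 23500]  # made-up predictions, for illustration only
print(round(coefficient_of_determination(actual, predicted), 4))
```

    A value near 1 means the predictions track the label closely; a value near 0 means the model does no better than predicting the mean.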

    After that, to deploy this as a web service, you must first set up a Predictive Experiment from your Training Experiment: Set Up Web Service > Predictive Experiment. Your experiment will rearrange itself like magic.


    The Web Service Input module is placed at the top of the experiment by default. I moved it and connected it to the right-hand input of Score Model (the dataset input), so that the parameters you send to the web service are scored directly by your Trained Model.

    After the Score Model module, I placed a Select Columns in Dataset module and selected only the column named Scored Labels. This column contains the model's predictions. Then I used an Edit Metadata module to rename the Scored Labels column before passing it to the Web Service Output module.

    Your experiment is now ready to deploy as a web service.

    To predict new values, I tested the web service using the current date's details as input (although the DATE_REF input should have been 20170818 :D).
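
    Filling in the date details by hand is error-prone, so here is a minimal Python sketch that derives the input columns from a date. The column rules are inferred from the sample rows in the question (WEEK_DAY runs Sunday=1 to Saturday=7, and weeks within a month start on Sunday); the function name and the `holidays` parameter are hypothetical. The sample marks 2010-01-01 as a non-working day, so WORK_DAY presumably depends on a holiday calendar that you would need to supply yourself.

```python
from datetime import date

def build_input_row(d, holidays=frozenset()):
    """Return the feature columns (everything except AMOUNT_SOLD) for date d."""
    week_day = d.isoweekday() % 7 + 1  # Sunday=1 ... Saturday=7, as in the sample
    first_week_day = d.replace(day=1).isoweekday() % 7 + 1
    weekend = int(week_day in (1, 7))
    row = {
        "DATE_REF": int(d.strftime("%Y%m%d")),
        "MONTH": d.month,
        "YEAR": d.year,
        "DAY_OF_YEAR": d.timetuple().tm_yday,
        "DAY_OF_MONTH": d.day,
        "WEEK_DAY": week_day,
    }
    # One-hot encoding of the week day.
    for i in range(1, 8):
        row[f"WEEK_DAY_{i}"] = int(i == week_day)
    # Weeks start on Sunday; the partial first week of the month counts as week 1.
    row["WEEK_NUMBER_IN_MONTH"] = (d.day + first_week_day - 2) // 7 + 1
    row["WEEKEND"] = weekend
    # Assumption: a work day is any non-weekend day that is not a listed holiday.
    row["WORK_DAY"] = int(not weekend and d not in holidays)
    return row

print(build_input_row(date(2017, 8, 18)))
```

    Running it against dates from the sample (for example 2010-01-04) is a quick way to confirm that the derived columns match the training data before calling the service.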

    [Screenshot: web service test input]

    And then the output looks like this:

    [Screenshot: web service test output]

    Your web service can now predict new values.
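
    If you want to call the deployed service from code rather than the test dialog, the request can be sketched as below. Treat this as an assumption-laden sketch: the payload shape (`Inputs` / `ColumnNames` / `Values` / `GlobalParameters`) follows the request-response sample code that the classic Azure ML Studio generated, as far as I recall it, and the input name `input1`, the URL, and the API key are placeholders. The API help page of your own web service is the authoritative reference for the exact schema, including whether the AMOUNT_SOLD column has to be sent.

```python
import json
import urllib.request

def build_request(url, api_key, row, input_name="input1"):
    """Build (but do not send) a request-response call for one input row."""
    payload = {
        "Inputs": {
            input_name: {
                "ColumnNames": list(row.keys()),
                "Values": [[str(v) for v in row.values()]],
            }
        },
        "GlobalParameters": {},
    }
    headers = {
        "Content-Type": "application/json",
        "Authorization": "Bearer " + api_key,
    }
    return urllib.request.Request(url, json.dumps(payload).encode("utf-8"), headers)

# Placeholder URL and key: replace them with the values from your service's dashboard.
request = build_request(
    "https://example.invalid/execute?api-version=2.0",
    "YOUR_API_KEY",
    {"DATE_REF": 20170818, "MONTH": 8, "YEAR": 2017},  # remaining columns omitted for brevity
)
# To actually send it:
# with urllib.request.urlopen(request) as response:
#     print(json.loads(response.read()))
```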