prediction, supervised-learning, featuretools, churn

prediction and time series


How do I decide how far in advance my prediction is?

I am following the Featuretools churn tutorial: https://github.com/Featuretools/predict-customer-churn

What I don't quite understand is how it was decided that the prediction is for one month in advance. In previous churn examples I tried, I just took aggregated data (historical data covering months or years), built a churn model, and predicted. But I don't know whether my prediction is for a month ahead, a year ahead, or how many days in advance. How is that decided? Does it depend on the aggregation period, or on the data I didn't use? I know the cutoff time is the time at which I want to make the prediction, but how do I tell the system that I want to predict 2 months in advance? Do I just disregard the last two months of data by setting the cut_off time, provide the label from after those two months, and say that the model built on the resulting features makes a 2-month-ahead prediction?

For example, the cut_off date is 1/8/2010 and the label is the customer's state on 1/10/2010, so the two-month gap is how far in advance the prediction is? And I use all historical data prior to the cut_off time?

This might be a time series problem that has been turned into a simple classification problem, but I am not sure!


Solution

  • You pick how far in advance to predict (called the "lead time") using your domain expertise. Depending on the real-world application, the lead time might be longer or shorter. Sometimes you might even build multiple models with different lead times to apply in different situations.

    You control the lead time by moving the cutoff time earlier relative to the time the label became known. Features are computed only from data before the cutoff, so the gap between the cutoff and the label time is how far ahead the model predicts. So, the example you give looks correct.
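
    To make the cutoff/lead-time relationship concrete, here is a minimal sketch in plain pandas rather than Featuretools. The table layouts, column names (`customer_id`, `label_time`, `churned`, `time`, `amount`), and the helper functions `make_cutoff_times` and `features_before_cutoff` are all made up for illustration and do not come from the tutorial. The dates follow the example in the question, read as day/month/year: a label known on 1 October 2010 and a two-month lead time, which puts the cutoff on 1 August 2010.

```python
import pandas as pd


def make_cutoff_times(label_times, lead_time):
    """Place each cutoff `lead_time` before the moment the label is known."""
    out = label_times.copy()
    out["cutoff_time"] = out["label_time"] - lead_time
    return out


def features_before_cutoff(transactions, cutoffs):
    """Aggregate per customer using only rows strictly before the cutoff."""
    merged = transactions.merge(
        cutoffs[["customer_id", "cutoff_time"]], on="customer_id"
    )
    visible = merged[merged["time"] < merged["cutoff_time"]]
    return (
        visible.groupby("customer_id")["amount"]
        .agg(total_spend="sum", n_transactions="count")
        .reset_index()
    )


# Hypothetical labels: the customer's state as observed on 1 October 2010.
label_times = pd.DataFrame({
    "customer_id": [1, 2],
    "label_time": pd.to_datetime(["2010-10-01", "2010-10-01"]),
    "churned": [False, True],
})

# Hypothetical transaction history; some rows fall inside the lead-time gap.
transactions = pd.DataFrame({
    "customer_id": [1, 1, 1, 2, 2],
    "time": pd.to_datetime(
        ["2010-06-15", "2010-07-20", "2010-09-05", "2010-05-01", "2010-08-15"]
    ),
    "amount": [10.0, 20.0, 30.0, 5.0, 7.0],
})

# Two-month lead time: cutoff is 1 August 2010 for both customers.
cutoffs = make_cutoff_times(label_times, pd.DateOffset(months=2))
features = features_before_cutoff(transactions, cutoffs)

# Training table: features from before the cutoff, label from two months later.
training = features.merge(cutoffs[["customer_id", "churned"]], on="customer_id")
print(training)
```

    The transactions dated in August and September are dropped because they fall between the cutoff and the label time. A model trained on `training` therefore learns to predict the customer's state two months ahead. In Featuretools, a table of cutoff times like `cutoffs` plays the role of the `cutoff_time` argument to `ft.dfs`; changing the offset passed to `make_cutoff_times` is all it takes to build a model with a different lead time.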