python machine-learning scikit-learn linear-regression predict

How to use .predict() in a Linear Regression model?


I'm trying to predict what a 15-minute delay in flight departure does to the flight's arrival time. I have thousands of rows as well as several columns in a DF. Two of these columns are dep_delay and arr_delay for departure delay and arrival delay. I have built a simple LinearRegression model:

y = nyc['dep_delay'].values.reshape((-1, 1))

arr_dep_model = LinearRegression().fit(y, nyc['arr_delay'])

And now I'm trying to find out the predicted arrival delay if the flight's departure was delayed by 15 minutes. How would I use the model above to predict what the arrival delay would be?

My first thought was to use a for loop / if statement, but then I came across .predict() and now I'm even more confused. Does .predict work like a boolean, where I would use "if departure delay is equal to 15, then arrival delay equals y"? Or is it something like:

arr_dep_model.predict(y)?

Solution

  • With a LinearRegression model in sklearn, you make predictions by calling predict() on the fitted model. The input you pass must have the same layout as the training data: a 2D array of shape (n_samples, n_features). For this one-feature model, that means each input row holds a single departure delay. You can learn more about the proper use of predict() in the official documentation.

    arr_dep_model.predict(yourInput)
    

    This call returns an array with one predicted value per input row. You can call it inside a for loop over a set of input values, or pass all the values at once as separate rows; which is better depends on the needs of your project and the data you are working with.
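As a rough sketch of the 15-minute case from the question: the arrays below are made-up stand-ins for nyc['dep_delay'] and nyc['arr_delay'] (not real flight data), and the names dep_delay, arr_delay, single, several and manual are only illustrative. It shows that predict() is not a boolean test; it evaluates the fitted line at whatever inputs you give it.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Hypothetical stand-ins for nyc['dep_delay'].values and nyc['arr_delay'].values.
# The numbers are invented for illustration (arr_delay = dep_delay - 3).
dep_delay = np.array([0, 5, 10, 20, 30, 45, 60], dtype=float)
arr_delay = dep_delay - 3

# Same reshaping as in the question: one column, one row per flight.
X = dep_delay.reshape((-1, 1))
arr_dep_model = LinearRegression().fit(X, arr_delay)

# predict() expects a 2D array, so a single 15-minute delay is [[15]].
single = arr_dep_model.predict(np.array([[15]]))

# Several inputs can be passed at once, one row each; no loop is needed.
several = arr_dep_model.predict(np.array([[15], [30], [60]]))

# For one feature, predict() is equivalent to slope * x + intercept.
manual = arr_dep_model.coef_[0] * 15 + arr_dep_model.intercept_

print(single[0], several, manual)
```

With the real DataFrame, the fitted model from the question could be used the same way, for example arr_dep_model.predict([[15]]); the predicted value will of course depend on the actual data.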