python numpy matplotlib data-science linear-regression

Artificially increase one value and observe the effect on another - Python


I have two columns, A and B. I would like to artificially increase the value in column A to 1000 and see what happens to the corresponding value in column B.

Data

A           B
500         20
200         10
100         5

Desired

A           B
500        20
200        10
100        5
1000       ?

In other words, I want to add a row with A = 1000 and estimate the corresponding value of B.

Doing

Using Python, I will first check for correlation, treating this as a linear regression problem.

pyplot.scatter(x='A', y='B', s=100, data=df)  # df: DataFrame holding columns A and B
pyplot.show()

Then I am thinking I can use linear regression to determine what the value of B will be if I increase the independent variable A. I am just not sure how to input the what-if values of A.

import numpy as np
from sklearn.linear_model import LinearRegression

x = np.array([500,200,100]).reshape((-1, 1))
y = np.array([20,10,5])

Any suggestions are appreciated.


Solution

  • You first need to create and fit a model before you can use it to make predictions.

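    import numpy as np
    from sklearn.linear_model import LinearRegression

    # scikit-learn expects a 2D feature array of shape (n_samples, n_features)
    x = np.array([500, 200, 100]).reshape(-1, 1)
    y = np.array([20, 10, 5])

    # Fit the model, then predict B for the what-if value A = 1000
    reg = LinearRegression().fit(x, y)
    reg.predict(np.array([[1000]]))  # approximately array([38.46])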
    

    A graph might help. In this case, the three points do not lie exactly on a straight line, so the model makes a best guess: it fits the line that minimizes the squared error between the line and the points, much like drawing a line of best fit.

    [Figure: scatter plot of the data with the line of best fit]

    Here, the equation of the line of best fit is Y = 0.03654*X + 1.923. Making a prediction just means plugging another X value into this formula to find the corresponding Y value. For X = 1000, this gives Y = 0.03654*1000 + 1.923 ≈ 38.46.
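To check the line-of-best-fit equation independently, here is a minimal sketch that recovers the slope and intercept with `np.polyfit` rather than scikit-learn (a swapped-in technique, not the method used above) and plugs in a couple of what-if values of A. The value 1000 comes from the question; 750 and the variable names (`a`, `b`, `what_if_a`, `predicted_b`) are illustrative assumptions.

```python
import numpy as np

# Data from the question
a = np.array([500, 200, 100], dtype=float)
b = np.array([20, 10, 5], dtype=float)

# Degree-1 least-squares fit; polyfit returns [slope, intercept]
slope, intercept = np.polyfit(a, b, deg=1)
print(f"B = {slope:.5f}*A + {intercept:.3f}")  # B = 0.03654*A + 1.923

# What-if values of A (1000 is from the question; 750 is an illustrative extra)
what_if_a = np.array([750.0, 1000.0])
predicted_b = slope * what_if_a + intercept

for a_val, b_val in zip(what_if_a, predicted_b):
    print(f"A = {a_val:.0f} -> predicted B = {b_val:.2f}")
# A = 750 -> predicted B = 29.33
# A = 1000 -> predicted B = 38.46
```

Keep in mind that A = 1000 lies well outside the observed range of 100 to 500, so this is an extrapolation from only three points and should be read as a rough estimate.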