python, machine-learning, sklearn-pandas

How to predict stock price for the next day with Python?


I'm trying to predict the next day's stock price for my series, but I don't know how to "query" my model. Here is my code in Python:

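import datetime

import pandas
# Assuming `web` is the pandas-datareader data module
import pandas_datareader.data as web

# Define my period
d1 = datetime.datetime(2016,1,1)
d2 = datetime.datetime(2016,7,1)

# Get the data
df = web.DataReader("GOOG", 'yahoo', d1, d2)
# Calculate some indicators (pandas.rolling_mean no longer exists in current pandas)
df['20d_ma'] = df['Adj Close'].rolling(window=20).mean()
df['50d_ma'] = df['Adj Close'].rolling(window=50).mean()
# Drop the warm-up rows where the moving averages are still NaN,
# otherwise LinearRegression.fit raises an error
df = df.dropna()

# Create the model
from sklearn.linear_model import LinearRegression
# sklearn.cross_validation was replaced by sklearn.model_selection
from sklearn.model_selection import train_test_split

X = df[list(df.columns)[6:]] # Adj Close and indicators...
y = df['Adj Close']

X_train, X_test, y_train, y_test = train_test_split(X, y)

model = LinearRegression()
model.fit(X_train,y_train)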

Ok, what I need is to query the model ( model.predict(..¿?..) ) to predict the stock price for the 'next' day.

How can I do it?

Thanks in advance!


Solution

  • model.predict(X_test) 
    

    Will do the job. And that's straight out of the wonderful documentation. Do your basic reading before asking questions.
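
For reference, here is a minimal sketch of what that call expects and returns. The data is synthetic and the name X_new is made up for the example; nothing here comes from the question's DataFrame. The point is only that predict takes a 2D array (one row per sample, one column per feature) and returns one value per row.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Synthetic data (an assumption for illustration): y = 2*x + 1 exactly.
X_train = np.arange(10, dtype=float).reshape(-1, 1)  # shape (10, 1)
y_train = 2.0 * X_train.ravel() + 1.0

model = LinearRegression()
model.fit(X_train, y_train)

# predict expects a 2D array: one row per sample, one column per feature.
X_new = np.array([[10.0], [11.0]])
predictions = model.predict(X_new)  # one prediction per row of X_new
print(predictions)
```

Note that this only scores rows whose features you already have, which is why it does not by itself give a "next day" forecast.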

    Edit1: In response to the comments: in that case your feature engineering has a problem. You cannot predict a value with a model using features whose values you don't have yet. You'll have to go back and rethink why you picked those features and how they affect your outcome variable.
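
One common way around the problem described in Edit1 is to pair each day's features with the following day's price, so that every feature is known at prediction time. The sketch below is an assumption about how that could look, not the asker's code: the prices are a synthetic random walk standing in for 'Adj Close', and the names target, feature_cols and latest are hypothetical.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression

# Synthetic random-walk prices stand in for the real 'Adj Close' series.
rng = np.random.default_rng(0)
dates = pd.bdate_range("2016-01-01", periods=250)
prices = pd.Series(700 + rng.normal(0, 5, size=250).cumsum(), index=dates)

df = pd.DataFrame({"Adj Close": prices})
df["20d_ma"] = df["Adj Close"].rolling(window=20).mean()
df["50d_ma"] = df["Adj Close"].rolling(window=50).mean()

feature_cols = ["Adj Close", "20d_ma", "50d_ma"]

# Target is the NEXT day's price; row t holds features known at the close of day t.
df["target"] = df["Adj Close"].shift(-1)

# Rows with complete features (the last one has no target yet).
features = df[feature_cols].dropna()
# Training rows: drops the moving-average warm-up rows and the final row.
train = df.dropna()

model = LinearRegression()
model.fit(train[feature_cols], train["target"])

# The most recent row has all features but no known target: use it to forecast.
latest = features.iloc[[-1]]
next_day_prediction = model.predict(latest)[0]
print(next_day_prediction)
```

If you evaluate such a model, a time-ordered split (train on earlier rows, test on later ones) is probably safer than a shuffled train_test_split, since shuffling lets the model see days that come after the ones it is tested on.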

    Edit2: Maybe what you need is two models: a time-series model on that 20d-avg to predict tomorrow's 20d-avg, and then a second model that uses that forecast to predict the stock price. Personally, I think you wouldn't need the second model if you can build the time-series model and get decent results.
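
A rough sketch of the first of those two models, under stated assumptions: a plain autoregression on the 20-day average, fitted with LinearRegression on lagged values, is swapped in here as a stand-in for a proper time-series model. The prices are synthetic, and n_lags = 3 and all variable names are hypothetical choices for illustration.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression

# Synthetic random-walk prices stand in for the real 'Adj Close' series.
rng = np.random.default_rng(1)
prices = pd.Series(700 + rng.normal(0, 5, size=250).cumsum())
ma20 = prices.rolling(window=20).mean().dropna()

n_lags = 3  # hypothetical choice; tune it on your own data
lag_names = [f"lag_{k}" for k in range(1, n_lags + 1)]

# Lag matrix: column lag_k holds the 20-day average from k days earlier.
lagged = pd.concat({f"lag_{k}": ma20.shift(k) for k in range(1, n_lags + 1)}, axis=1)
data = lagged.assign(target=ma20).dropna()

ar_model = LinearRegression()
ar_model.fit(data[lag_names], data["target"])

# For tomorrow, lag_1 is today's average, lag_2 is yesterday's, and so on.
latest_lags = pd.DataFrame([ma20.iloc[-n_lags:][::-1].to_numpy()], columns=lag_names)
tomorrow_ma20 = ar_model.predict(latest_lags)[0]
print(tomorrow_ma20)
```

The forecast tomorrow_ma20 could then be passed as the 20d_ma feature to the second model, or the same lagging approach could be applied to the price itself, which is the single-model route mentioned in Edit2.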