I was following Sentdex's Machine Learning tutorials on YouTube. In the 5th part he does this:
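import math
import numpy as np
from sklearn import preprocessing

# Number of rows to forecast ahead: 1% of the dataset, rounded up
forecast_out = int(math.ceil(0.01*len(df)))
print(forecast_out)
# Label for each row = value of forecast_col forecast_out rows later
df['label'] = df[forecast_col].shift(-forecast_out)
X = np.array(df.drop(['label'], axis=1))
X = preprocessing.scale(X)
# Take the unlabeled tail before truncating X
X_lately = X[-forecast_out:]
X = X[:-forecast_out]
df.dropna(inplace=True)
y = np.array(df['label'])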
I got completely lost about what he was trying to do here. With int(math.ceil(0.01*len(df)))
he was computing the number of days ahead he wants to predict. After that, he did df[forecast_col].shift(-forecast_out)
and I couldn't follow anything after that.
There is not enough information here, but if this is a time series forecasting problem, then I think df[forecast_col].shift(-forecast_out)
shifts the forecast column up by forecast_out rows, so that the label for a given day is the value you need to forecast (that is, the value taken from forecast_out days in the future). The last forecast_out rows have no future value to take, so their label is NaN.
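
To see the shift concretely, here is a small sketch on made-up data. The `Close` column, the ten prices, and the 20% horizon are assumptions chosen only to keep the output readable; they are not from the tutorial, which uses 1% of a much larger dataset.

```python
import math
import pandas as pd

# Hypothetical toy data: ten days of closing prices (not the tutorial's dataset).
df = pd.DataFrame({"Close": [10.0, 11.0, 12.0, 13.0, 14.0,
                             15.0, 16.0, 17.0, 18.0, 19.0]})

forecast_col = "Close"
# Assumed horizon of 20% of the rows so the effect is visible on ten rows.
forecast_out = int(math.ceil(0.2 * len(df)))

# shift(-n) moves every value n rows up: row i receives the value from row i + n.
df["label"] = df[forecast_col].shift(-forecast_out)

# The last n rows have no future value available, so their label is NaN.
unlabeled = df[df["label"].isna()]

# dropna() keeps only the rows that have a label to train on.
labeled = df.dropna()

print(df)
```

Each row's `label` is the `Close` value from `forecast_out` rows later, so a model trained on `labeled` learns to predict that many days ahead. The rows in `unlabeled` play the same role as `X_lately` in the question: they have features but no known future value yet.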