Update: The error might be caused by the fact that my dataset also contains a variable named "Q", which conflicts with the Q function. In this case, how do I solve it elegantly?
Update: You can download my dataset here.
I am running a simple OLS regression with statsmodels on a pandas DataFrame as follows:
import statsmodels.formula.api as sm
import pandas as pd
df = pd.read_csv("exp.csv")
# df is a DataFrame containing many columns such as AAPL, SPY, INF, etc.
for column in df:
    result = sm.ols(formula="SPY" + " ~ " + column, data=df).fit()
However, one of the column names in df is INF. I guess INF may be a reserved word for patsy, because the code gives me the following error:
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/home/ap248/.local/easybuild/software/2017/Core/miniconda2/4.3.27/lib/python2.7/site-packages/statsmodels/base/model.py", line 155, in from_formula
missing=missing)
File "/home/ap248/.local/easybuild/software/2017/Core/miniconda2/4.3.27/lib/python2.7/site-packages/statsmodels/formula/formulatools.py", line 65, in handle_formula_data
NA_action=na_action)
File "/home/ap248/.local/easybuild/software/2017/Core/miniconda2/4.3.27/lib/python2.7/site-packages/patsy/highlevel.py", line 310, in dmatrices
NA_action, return_type)
File "/home/ap248/.local/easybuild/software/2017/Core/miniconda2/4.3.27/lib/python2.7/site-packages/patsy/highlevel.py", line 165, in _do_highlevel_design
NA_action)
File "/home/ap248/.local/easybuild/software/2017/Core/miniconda2/4.3.27/lib/python2.7/site-packages/patsy/highlevel.py", line 62, in _try_incr_builders
formula_like = ModelDesc.from_formula(formula_like)
File "/home/ap248/.local/easybuild/software/2017/Core/miniconda2/4.3.27/lib/python2.7/site-packages/patsy/desc.py", line 165, in from_formula
value = Evaluator().eval(tree, require_evalexpr=False)
File "/home/ap248/.local/easybuild/software/2017/Core/miniconda2/4.3.27/lib/python2.7/site-packages/patsy/desc.py", line 400, in eval
result = self._evaluators[key](self, tree)
File "/home/ap248/.local/easybuild/software/2017/Core/miniconda2/4.3.27/lib/python2.7/site-packages/patsy/desc.py", line 221, in _eval_any_tilde
exprs = [evaluator.eval(arg) for arg in tree.args]
File "/home/ap248/.local/easybuild/software/2017/Core/miniconda2/4.3.27/lib/python2.7/site-packages/patsy/desc.py", line 400, in eval
result = self._evaluators[key](self, tree)
File "/home/ap248/.local/easybuild/software/2017/Core/miniconda2/4.3.27/lib/python2.7/site-packages/patsy/desc.py", line 355, in _eval_number
"only allowed with **", tree)
patsy.PatsyError: numbers besides '0' and '1' are only allowed with **
    SPY ~ INF
          ^^^
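A plausible reading of this message, offered as an assumption rather than a confirmed description of patsy's internals: a bare token in a formula seems to be classified as a number whenever Python's own int() or float() accepts it, and float("INF") succeeds. The sketch below uses a hypothetical helper, parses_as_number, to flag column names that would run into this; it does not call patsy itself.

```python
def parses_as_number(name):
    """Return True if Python's int() or float() accepts `name`.

    Assumption: the formula parser uses an equivalent check to decide that
    a bare token is a numeric literal rather than a variable name.
    """
    for convert in (int, float):
        try:
            convert(name)
            return True
        except ValueError:
            continue
    return False


# Hypothetical column names, modelled on the ones mentioned in the question.
columns = ["AAPL", "SPY", "INF", "nan", "Q"]
risky = [c for c in columns if parses_as_number(c)]
print(risky)  # prints: ['INF', 'nan']
```

If this reading is right, INF is not a reserved word as such; any column whose name Python can read as a float (INF, nan, Infinity, and so on) would trigger the same error.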
I have also tried using the Q function:
result=sm.ols(formula="SPY"+" ~ "+"Q('INF')", data=df).fit()
However, it gives me the following error instead:
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/home/ap248/.local/easybuild/software/2017/Core/miniconda2/4.3.27/lib/python2.7/site-packages/statsmodels/base/model.py", line 155, in from_formula
missing=missing)
File "/home/ap248/.local/easybuild/software/2017/Core/miniconda2/4.3.27/lib/python2.7/site-packages/statsmodels/formula/formulatools.py", line 65, in handle_formula_data
NA_action=na_action)
File "/home/ap248/.local/easybuild/software/2017/Core/miniconda2/4.3.27/lib/python2.7/site-packages/patsy/highlevel.py", line 310, in dmatrices
NA_action, return_type)
File "/home/ap248/.local/easybuild/software/2017/Core/miniconda2/4.3.27/lib/python2.7/site-packages/patsy/highlevel.py", line 165, in _do_highlevel_design
NA_action)
File "/home/ap248/.local/easybuild/software/2017/Core/miniconda2/4.3.27/lib/python2.7/site-packages/patsy/highlevel.py", line 70, in _try_incr_builders
NA_action)
File "/home/ap248/.local/easybuild/software/2017/Core/miniconda2/4.3.27/lib/python2.7/site-packages/patsy/build.py", line 696, in design_matrix_builders
NA_action)
File "/home/ap248/.local/easybuild/software/2017/Core/miniconda2/4.3.27/lib/python2.7/site-packages/patsy/build.py", line 443, in _examine_factor_types
value = factor.eval(factor_states[factor], data)
File "/home/ap248/.local/easybuild/software/2017/Core/miniconda2/4.3.27/lib/python2.7/site-packages/patsy/eval.py", line 566, in eval
data)
File "/home/ap248/.local/easybuild/software/2017/Core/miniconda2/4.3.27/lib/python2.7/site-packages/patsy/eval.py", line 551, in _eval
inner_namespace=inner_namespace)
File "/home/ap248/.local/easybuild/software/2017/Core/miniconda2/4.3.27/lib/python2.7/site-packages/patsy/compat.py", line 36, in call_and_wrap_exc
return f(*args, **kwargs)
File "/home/ap248/.local/easybuild/software/2017/Core/miniconda2/4.3.27/lib/python2.7/site-packages/patsy/eval.py", line 166, in eval
+ self._namespaces))
File "<string>", line 1, in <module>
TypeError: 'Series' object is not callable
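The TypeError is consistent with the dataset's own "Q" column being found before the built-in Q function when names in the formula are resolved. The toy model below illustrates that lookup order with plain dictionaries; the lookup function and both namespaces are hypothetical stand-ins, not patsy's actual code.

```python
def lookup(name, namespaces):
    """Return the first match for `name`, searching namespaces in order."""
    for namespace in namespaces:
        if name in namespace:
            return namespace[name]
    raise NameError(name)


def Q(name):
    """Stand-in for a quoting helper that fetches a column by its name."""
    return "column:" + name


# Hypothetical namespaces: the data is searched before the built-in helpers.
builtin_helpers = {"Q": Q}
data = {"SPY": [1.0, 2.0], "INF": [0.5, 0.7], "Q": [0.1, 0.2]}

found = lookup("Q", [data, builtin_helpers])
print(callable(found))  # prints: False
```

Because the name resolves to the column rather than the function, an expression like Q('INF') ends up trying to call a column of data, which matches the "'Series' object is not callable" message.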
Any idea how to solve it?
I solved the issue by bypassing the formula interface and using the direct interface instead:
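import statsmodels.api as sm  # add_constant is provided by statsmodels.api
for column in df:
    # SPY is the response and the current column is the regressor, matching "SPY ~ column"
    Y, X = df['SPY'], df[column]
    X = sm.add_constant(X)
    result = sm.OLS(Y, X).fit()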
It seems to me that the formula interface still has some issues and is not easy to use.
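One way to keep the formula interface would be to rename the columns to neutral aliases before building the formulas, so that no name looks like a number or collides with Q. This is only a sketch: safe_column_names and the col_ prefix are hypothetical, and the column list is illustrative rather than taken from the actual dataset.

```python
def safe_column_names(columns, prefix="col_"):
    """Map each original column name to a formula-safe alias.

    The aliases (col_0, col_1, ...) are hypothetical; any scheme works as
    long as each alias is a plain identifier that does not read as a number
    (INF, nan) and does not shadow a formula helper such as Q.
    """
    return {name: "{}{}".format(prefix, i) for i, name in enumerate(columns)}


# Hypothetical column names, modelled on the ones mentioned in the question.
columns = ["SPY", "AAPL", "INF", "Q"]
aliases = safe_column_names(columns)

# Build one formula per regressor, skipping the response itself.
formulas = {
    name: "{} ~ {}".format(aliases["SPY"], aliases[name])
    for name in columns
    if name != "SPY"
}
print(formulas["INF"])  # prints: col_0 ~ col_2
```

With the real data, the mapping could be applied with df.rename(columns=aliases) and each formula passed to sm.ols together with the renamed DataFrame; the original names can be recovered from the mapping when reporting results. The sketch assumes no existing column already uses the col_ prefix.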