python pandas statistics regression statsmodels

Statsmodels (Patsy) illegal variable name / 'Series' object is not callable Error


Update:

The error might be caused by the fact that my dataset also contains a variable named "Q", which conflicts with patsy's Q function. In that case, how do I solve it elegantly?


Update: You can download my dataset here.


I am running a simple OLS regression with statsmodels on a pandas DataFrame as follows:

import statsmodels.formula.api as sm
import pandas as pd
df = pd.read_csv("exp.csv")
# df is a DataFrame containing many columns, such as AAPL, SPY, INF, etc.
for column in df:
    result = sm.ols(formula="SPY ~ " + column, data=df).fit()

However, one of the column names in df is INF. My guess is that INF is a reserved word in patsy, because the code gives me the following error:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/ap248/.local/easybuild/software/2017/Core/miniconda2/4.3.27/lib/python2.7/site-packages/statsmodels/base/model.py", line 155, in from_formula
    missing=missing)
  File "/home/ap248/.local/easybuild/software/2017/Core/miniconda2/4.3.27/lib/python2.7/site-packages/statsmodels/formula/formulatools.py", line 65, in handle_formula_data
    NA_action=na_action)
  File "/home/ap248/.local/easybuild/software/2017/Core/miniconda2/4.3.27/lib/python2.7/site-packages/patsy/highlevel.py", line 310, in dmatrices
    NA_action, return_type)
  File "/home/ap248/.local/easybuild/software/2017/Core/miniconda2/4.3.27/lib/python2.7/site-packages/patsy/highlevel.py", line 165, in _do_highlevel_design
    NA_action)
  File "/home/ap248/.local/easybuild/software/2017/Core/miniconda2/4.3.27/lib/python2.7/site-packages/patsy/highlevel.py", line 62, in _try_incr_builders
    formula_like = ModelDesc.from_formula(formula_like)
  File "/home/ap248/.local/easybuild/software/2017/Core/miniconda2/4.3.27/lib/python2.7/site-packages/patsy/desc.py", line 165, in from_formula
    value = Evaluator().eval(tree, require_evalexpr=False)
  File "/home/ap248/.local/easybuild/software/2017/Core/miniconda2/4.3.27/lib/python2.7/site-packages/patsy/desc.py", line 400, in eval
    result = self._evaluators[key](self, tree)
  File "/home/ap248/.local/easybuild/software/2017/Core/miniconda2/4.3.27/lib/python2.7/site-packages/patsy/desc.py", line 221, in _eval_any_tilde
    exprs = [evaluator.eval(arg) for arg in tree.args]    
  File "/home/ap248/.local/easybuild/software/2017/Core/miniconda2/4.3.27/lib/python2.7/site-packages/patsy/desc.py", line 400, in eval
    result = self._evaluators[key](self, tree)
  File "/home/ap248/.local/easybuild/software/2017/Core/miniconda2/4.3.27/lib/python2.7/site-packages/patsy/desc.py", line 355, in _eval_number
    "only allowed with **", tree)
patsy.PatsyError: numbers besides '0' and '1' are only allowed with **
    SPY ~ INF
          ^^^
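
The caret under INF points at the likely cause: the message suggests patsy read INF as a numeric literal rather than as a variable name. Python's own float() accepts "INF" (and "NAN") as valid numbers. The sketch below only demonstrates that float() behaviour. That patsy classifies formula tokens this way is an assumption based on the error message, and the column list is hypothetical.

```python
# Sketch: which column names would Python's float() accept as a number?
# Assumption: patsy treats a bare formula token as a number whenever
# float() can parse it, which would explain the error for "INF".

def looks_like_number(name):
    """Return True if float() can parse the string."""
    try:
        float(name)
    except ValueError:
        return False
    return True

columns = ["AAPL", "SPY", "INF", "NAN", "Q"]  # hypothetical column names
number_like = [c for c in columns if looks_like_number(c)]
print(number_like)  # ['INF', 'NAN']
```

Any column flagged this way is a candidate for renaming before it is used in a formula.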

I have also tried using the Q function:

result=sm.ols(formula="SPY"+" ~ "+"Q('INF')", data=df).fit()

However, it gives me the following error instead:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/ap248/.local/easybuild/software/2017/Core/miniconda2/4.3.27/lib/python2.7/site-packages/statsmodels/base/model.py", line 155, in from_formula
    missing=missing)
  File "/home/ap248/.local/easybuild/software/2017/Core/miniconda2/4.3.27/lib/python2.7/site-packages/statsmodels/formula/formulatools.py", line 65, in handle_formula_data
    NA_action=na_action)
  File "/home/ap248/.local/easybuild/software/2017/Core/miniconda2/4.3.27/lib/python2.7/site-packages/patsy/highlevel.py", line 310, in dmatrices
    NA_action, return_type)
  File "/home/ap248/.local/easybuild/software/2017/Core/miniconda2/4.3.27/lib/python2.7/site-packages/patsy/highlevel.py", line 165, in _do_highlevel_design
    NA_action)
  File "/home/ap248/.local/easybuild/software/2017/Core/miniconda2/4.3.27/lib/python2.7/site-packages/patsy/highlevel.py", line 70, in _try_incr_builders
    NA_action)
  File "/home/ap248/.local/easybuild/software/2017/Core/miniconda2/4.3.27/lib/python2.7/site-packages/patsy/build.py", line 696, in design_matrix_builders
    NA_action)
  File "/home/ap248/.local/easybuild/software/2017/Core/miniconda2/4.3.27/lib/python2.7/site-packages/patsy/build.py", line 443, in _examine_factor_types
    value = factor.eval(factor_states[factor], data)
  File "/home/ap248/.local/easybuild/software/2017/Core/miniconda2/4.3.27/lib/python2.7/site-packages/patsy/eval.py", line 566, in eval
    data)
  File "/home/ap248/.local/easybuild/software/2017/Core/miniconda2/4.3.27/lib/python2.7/site-packages/patsy/eval.py", line 551, in _eval
    inner_namespace=inner_namespace)
  File "/home/ap248/.local/easybuild/software/2017/Core/miniconda2/4.3.27/lib/python2.7/site-packages/patsy/compat.py", line 36, in call_and_wrap_exc
    return f(*args, **kwargs)
  File "/home/ap248/.local/easybuild/software/2017/Core/miniconda2/4.3.27/lib/python2.7/site-packages/patsy/eval.py", line 166, in eval
    + self._namespaces))
  File "<string>", line 1, in <module>
TypeError: 'Series' object is not callable
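
This TypeError is consistent with the column named "Q" hiding patsy's Q function. If names in a formula are looked up in the data before the built-in helpers, then Q resolves to the column (a Series) and calling it fails. The pure-Python sketch below imitates that lookup order with a ChainMap. It is not patsy's actual code, the lookup order is an assumption drawn from the traceback, and plain lists stand in for Series.

```python
from collections import ChainMap

# Hypothetical stand-ins: plain lists play the role of pandas Series.
data = {"SPY": [1.0, 2.0], "INF": [0.1, 0.2], "Q": [5.0, 6.0]}

def quote(name):
    """Stand-in for patsy's Q(): fetch a column by its string name."""
    return data[name]

builtins_ns = {"Q": quote}

# Data is searched first, so the column "Q" hides the function "Q".
namespace = ChainMap(data, builtins_ns)

try:
    eval("Q('INF')", {}, namespace)
    outcome = "ok"
except TypeError as exc:
    outcome = str(exc)

print(outcome)  # 'list' object is not callable

# Without a column named "Q", the same expression reaches the function.
del data["Q"]
print(eval("Q('INF')", {}, namespace))  # [0.1, 0.2]
```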

Any idea how to solve it?


Solution

  • I solved the issue by skipping the formula interface and using the direct interface instead:

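    import statsmodels.api as sm  # add_constant lives here, not in statsmodels.formula.api

    for column in df:
        # Same model as the formula "SPY ~ column": SPY is the response
        Y, X = df['SPY'], df[column]
        X = sm.add_constant(X)  # add the intercept term
        result = sm.OLS(Y, X).fit()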
    

    It looks to me as though the formula interface still has some issues and is not easy to use.
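
If the formula interface is still preferred, one possible workaround is to rename the offending columns so that Q() is never needed. The helper below is a hypothetical sketch, not part of statsmodels or patsy. It assumes string column names and renames only those that float() would parse as a number; the "_col" suffix is an arbitrary choice.

```python
def formula_safe_names(columns, suffix="_col"):
    """Return {old: new} for names that float() would parse as numbers.

    Hypothetical helper; assumes every column name is a string.
    """
    taken = set(columns)
    mapping = {}
    for name in columns:
        try:
            float(name)
        except ValueError:
            continue  # ordinary name, keep it unchanged
        new_name = name + suffix
        while new_name in taken:  # avoid clashing with an existing column
            new_name += suffix
        taken.add(new_name)
        mapping[name] = new_name
    return mapping

columns = ["AAPL", "SPY", "INF", "Q"]  # hypothetical column names
mapping = formula_safe_names(columns)
print(mapping)  # {'INF': 'INF_col'}

# One formula per regressor, skipping the response itself.
formulas = ["SPY ~ " + mapping.get(c, c) for c in columns if c != "SPY"]
print(formulas)  # ['SPY ~ AAPL', 'SPY ~ INF_col', 'SPY ~ Q']
```

The mapping can be applied with df = df.rename(columns=mapping) before the formulas are passed to sm.ols(formula=..., data=df). A column named Q should then need no special handling, since it is only used as a variable and Q() is never called.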