I use patsy to build a design matrix, and I need to include powers of the original factors. For example, for a regression of y on x1, x1^2, x2, x2^2 and x2^3, I want to be able to write
patsy.dmatrix('y~x1 + x1**2 + x2 + x2**2 + x2**3', data)
where data is a dataframe that contains the columns y, x1 and x2. But the power terms do not show up in the resulting design matrix. Any solutions?
Patsy has a special interpretation of ** that it inherited from R: in a formula, ** expands a term into interactions rather than raising it to a power. I've considered making it automatically do the right thing when applied to numeric factors, but haven't actually implemented it. In the meantime, there's a general method for telling patsy to switch to the Python interpretation of operators instead of the Patsy interpretation: you wrap your expression in I(...). So:
patsy.dmatrices('y ~ x1 + I(x1**2) + x2 + I(x2**2) + I(x2**3)', data)
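To see the difference concretely, here is a minimal sketch. The data values are made up for illustration; only the column names y, x1 and x2 come from the question. It also assumes that a formula with a left-hand side should go through patsy.dmatrices (which returns both the outcome and the predictor matrices), while one-sided formulas go through patsy.dmatrix.

```python
import numpy as np
import pandas as pd
import patsy

# Hypothetical toy data; only the column names come from the question.
data = pd.DataFrame({
    "y":  [1.0, 2.0, 3.0, 4.0],
    "x1": [1.0, 2.0, 3.0, 4.0],
    "x2": [0.5, 1.5, 2.5, 3.5],
})

# Without I(), ** is a formula operator: x1**2 expands to x1 + x1:x1,
# and the interaction of x1 with itself collapses back to x1.
plain = patsy.dmatrix("x1 + x1**2", data)
plain_cols = plain.design_info.column_names

# With I(), the expression inside is evaluated as ordinary Python,
# so the matrix gains a real squared column.
wrapped = patsy.dmatrix("x1 + I(x1**2)", data)
wrapped_cols = wrapped.design_info.column_names
squared = np.asarray(wrapped)[:, 2]

# Two-sided formula: dmatrices returns (outcome, predictors).
y_mat, X_mat = patsy.dmatrices(
    "y ~ x1 + I(x1**2) + x2 + I(x2**2) + I(x2**3)", data
)

print(plain_cols)
print(wrapped_cols)
print(X_mat.design_info.column_names)
```

Comparing the printed column names shows that the unwrapped formula silently drops the power term, while each I(...) term gets its own column.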