Python statsmodels - quadratic term in regression
The simplest way is to wrap the squared term in I(...) inside the formula (here sm is statsmodels.formula.api):
model = sm.ols(formula = 'a ~ b + c + I(b**2)', data = data).fit()
The I(...) wrapper basically says "patsy, please stop being clever here and just let Python handle everything inside kthx". (More detailed explanation)
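To see what I(...) changes, here is a small sketch that builds the design matrix directly with patsy. The DataFrame and its values are made up for illustration. Without I(), patsy reads ** as its own formula operator, so b**2 collapses to plain b. With I(), the expression is evaluated as normal Python arithmetic.

```python
import pandas as pd
from patsy import dmatrix

# Hypothetical toy data, used only for illustration.
df = pd.DataFrame({"b": [1.0, 2.0, 3.0, 4.0]})

# Without I(): ** is a patsy formula operator, so "b**2" reduces to the term b.
plain = dmatrix("b**2", df, return_type="dataframe")

# With I(): the contents are evaluated as ordinary Python, giving b squared.
wrapped = dmatrix("I(b**2)", df, return_type="dataframe")

# Column 0 is the intercept; column 1 is the term of interest.
print(plain.iloc[:, 1].tolist())
print(wrapped.iloc[:, 1].tolist())
```

The first list should hold the raw b values and the second their squares, which is why the bare b**2 in a formula silently fits no quadratic term at all.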
Although the solution by Alexander works, in some situations it is not very convenient. For example, each time you want to predict the outcome of the model for new values, you need to remember to pass both b**2 and b, which is cumbersome and should not be necessary. Patsy does not treat a bare "b**2" in a formula as arithmetic (it parses ** as its own formula operator), but it does recognize numpy functions. Thus, you can use
import statsmodels.formula.api as sm
import numpy as np
data = {"a":[2, 3, 5], "b":[2, 3, 5], "c":[2, 3, 5]}
model = sm.ols(formula = 'a ~ np.power(b, 2) + b + c', data = data).fit()
This way, you can later reuse the model without having to specify a value for b**2:
model.predict({"a":[1, 2], "b":[5, 2], "c":[2, 4]})
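As a rough check, the sketch below fits the model on synthetic data (the coefficients, noise level, and sample size are assumptions chosen for illustration) and predicts from raw b and c only. It also fits the I(b**2) spelling, since both put the transformation inside the formula and so should give the same predictions.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Hypothetical synthetic data: a depends on b, b squared, and c, plus small noise.
rng = np.random.default_rng(0)
df = pd.DataFrame({"b": rng.uniform(0, 5, 50), "c": rng.uniform(0, 5, 50)})
df["a"] = (1.0 + 2.0 * df["b"] + 0.5 * df["b"] ** 2
           - 1.5 * df["c"] + rng.normal(0, 0.1, 50))

# Two spellings of the same quadratic model.
fit_np = smf.ols("a ~ b + c + np.power(b, 2)", data=df).fit()
fit_i = smf.ols("a ~ b + c + I(b**2)", data=df).fit()

# New data contains only the raw predictors; the squared term is rebuilt by the formula.
new = pd.DataFrame({"b": [5.0, 2.0], "c": [2.0, 4.0]})
pred_np = fit_np.predict(new)
pred_i = fit_i.predict(new)

print(pred_np.tolist())
print(np.allclose(pred_np, pred_i))
```

Because the squared term lives in the formula rather than in a precomputed column, predict() only needs the original variables. The response a can be left out of the new data as well.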