scipy linregress function erroneous standard error return?

I've just been informed by the SciPy user group that the std_err here represents the standard error of the gradient (the slope of the fitted line), not the standard error of the predicted y's, which is what Excel reports. Nevertheless, users of this function should be careful, because this was not always the behaviour of this library: it used to give the same output as Excel, and the changeover appears to have occurred in the past few months.

Anyway, I'm still looking for an equivalent to STEYX in Python.


You could try the statsmodels package:

In [36]: import numpy as np

In [37]: import statsmodels.api as sm

In [38]: x = [5.05, 6.75, 3.21, 2.66]

In [39]: y = [1.65, 26.5, -5.93, 7.96]

In [40]: X = sm.add_constant(x) # intercept

In [41]: model = sm.OLS(y, X)

In [42]: fit = model.fit()

In [43]: fit.params
Out[43]: array([  5.39357736, -16.28112799])

In [44]: fit.rsquared
Out[44]: 0.52480627513624789

In [45]: np.sqrt(fit.mse_resid)
Out[45]: 11.696414461570097
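
For reference, np.sqrt(fit.mse_resid) is the residual standard error: the square root of the residual sum of squares divided by n - 2 degrees of freedom, which is the quantity Excel documents for STEYX. If pulling in statsmodels is not an option, the same number can be sketched with NumPy alone. This assumes a simple one-predictor fit, and the helper name steyx is made up here for illustration; it is not part of any library.

```python
import numpy as np

def steyx(x, y):
    """Standard error of the predicted y for a simple linear regression.

    Hypothetical helper, named after the Excel function it mimics.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n = x.size
    if n < 3:
        raise ValueError("need at least 3 points (n - 2 degrees of freedom)")
    # Least-squares slope and intercept of the straight-line fit
    slope, intercept = np.polyfit(x, y, 1)
    residuals = y - (slope * x + intercept)
    # Residual sum of squares divided by n - 2 degrees of freedom
    return np.sqrt(np.sum(residuals ** 2) / (n - 2))

x = [5.05, 6.75, 3.21, 2.66]
y = [1.65, 26.5, -5.93, 7.96]
see = steyx(x, y)
print(see)  # should agree with np.sqrt(fit.mse_resid) from the session above
```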

Yes, this is true: the standard error of the gradient is what linregress returns. The standard error of the estimate (of Y) is related, though, and you can back into the SEE from the standard error of the gradient (SEG) that linregress gives you. Since SEG = SEE / sqrt( sum of (X - average X)**2 ), multiplying SEG by sqrt( sum of (X - average X)**2 ) gives the SEE.

Stack Exchange doesn't render LaTeX, but the math is here if you are interested, under the "Analyze Sample Data" heading.
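
The rearrangement of the SEG formula can be sketched in a few lines. This assumes a SciPy version in which the stderr field of the linregress result is the standard error of the slope, as described in the question; the data are the same four points used in the statsmodels answer, and the variable names seg, sxx and see are just for illustration.

```python
import numpy as np
from scipy import stats

x = np.array([5.05, 6.75, 3.21, 2.66])
y = np.array([1.65, 26.5, -5.93, 7.96])

result = stats.linregress(x, y)
seg = result.stderr  # standard error of the gradient (slope)

# SEG = SEE / sqrt(sum((x - mean(x))**2)), so multiply back to recover SEE
sxx = np.sum((x - x.mean()) ** 2)
see = seg * np.sqrt(sxx)
print(see)
```

On these points the result should match the np.sqrt(fit.mse_resid) value from the statsmodels session, which is a quick way to check which standard error your installed SciPy is returning.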