Calculating Slopes in Numpy (or Scipy)

Linear regression in one dimension is a vector calculation, so the multiplications can be applied to the entire Y matrix at once and the fits vectorized with numpy's axis parameter. With one series per row of Y and a shared 1-D X, that works out in your case to the following:

((X*Y).mean(axis=1) - X.mean()*Y.mean(axis=1)) / ((X**2).mean() - (X.mean())**2)
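
A minimal sketch of that expression in use, assuming Y holds one series per row (shape (n_series, n_points)) and X is a 1-D array of the shared x-values. The sample data is made up for illustration and is not from the question.

```python
import numpy as np

# Hypothetical data: three series (rows) sampled at the same five x-values.
X = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
Y = np.array([
    [1.0, 3.0, 5.0, 7.0, 9.0],      # y = 2x + 1
    [0.0, -1.0, -2.0, -3.0, -4.0],  # y = -x
    [2.0, 2.5, 3.0, 3.5, 4.0],      # y = 0.5x + 2
])

# X broadcasts against each row of Y; axis=1 averages over the points
# of each series, giving one slope per row.
slopes = ((X * Y).mean(axis=1) - X.mean() * Y.mean(axis=1)) / ((X**2).mean() - X.mean()**2)
print(slopes)  # approximately 2, -1 and 0.5
```

If the series are stored as columns instead, the same idea should work with X[:, None] * Y and axis=0.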

You're not interested in fit-quality parameters, but most of them can be obtained in a similar manner.
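
As a rough illustration of "in a similar manner", the sketch below computes the intercept and the correlation coefficient with the same mean-based building blocks. The variable names and the sample data are assumptions made for this example; Y again has one series per row.

```python
import numpy as np

# Hypothetical data: two noisy series (rows) over a shared x-axis.
X = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
Y = np.array([
    [1.0, 3.5, 4.5, 7.0, 9.5],
    [0.0, -1.5, -1.5, -3.0, -4.5],
])

x_mean = X.mean()
y_mean = Y.mean(axis=1)

# Population covariance and variances, all computed per row.
cov_xy = (X * Y).mean(axis=1) - x_mean * y_mean
var_x = (X**2).mean() - x_mean**2
var_y = (Y**2).mean(axis=1) - y_mean**2

slopes = cov_xy / var_x
intercepts = y_mean - slopes * x_mean
r_values = cov_xy / np.sqrt(var_x * var_y)  # Pearson correlation per row
```

One caveat: the mean-of-squares minus square-of-mean form can lose precision when the x-values are large relative to their spread (calendar years, for example). Subtracting the mean from X first avoids that.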


A representation that's simpler than the accepted answer:

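import numpy as np

x = np.linspace(0, 10, 11)
y = np.linspace(0, 20, 11)
y = np.c_[y, y, y]  # three identical series as columns, shape (11, 3)

# Center both variables; each column of y is centered on its own mean
X = x - x.mean()
Y = y - y.mean(axis=0)

slope = X.dot(Y) / X.dot(X)  # one slope per column of y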

The equation for the slope comes from Vector notation for the slope of a line using simple regression.


A convenient way is to use scipy's own linregress function (scipy.stats.linregress), which fits one x/y pair per call and calculates everything at once:

slope : slope of the regression line

intercept : intercept of the regression line

r-value : correlation coefficient

p-value : two-sided p-value for a hypothesis test whose null hypothesis is that the slope is zero

stderr : standard error of the estimated slope

And here is an example:

a = [15, 12, 8, 8, 7, 7, 7, 6, 5, 3]
b = [10, 25, 17, 11, 13, 17, 20, 13, 9, 15]
from scipy.stats import linregress
linregress(a, b)

returns:

LinregressResult(slope=0.20833333333333337, intercept=13.375, rvalue=0.14499815458068521, pvalue=0.68940144811669501, stderr=0.50261704627083648)
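
The fields of the result can also be read by name. The sketch below shows that, plus one way to cover several series: since linregress fits a single x/y pair per call, it loops over them. The second series in ys is invented for illustration.

```python
import numpy as np
from scipy.stats import linregress

a = [15, 12, 8, 8, 7, 7, 7, 6, 5, 3]
b = [10, 25, 17, 11, 13, 17, 20, 13, 9, 15]

# Access individual fields of the result by attribute name.
result = linregress(a, b)
print(result.slope, result.intercept, result.rvalue)

# Several y-series over the same x: one linregress call per series.
ys = np.array([b, [2 * v for v in b]])  # hypothetical second series (b doubled)
slopes = [linregress(a, y).slope for y in ys]
```

For a large number of series, this loop is likely to be slower than the vectorized numpy approaches in the other answers.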

P.S. The mathematical formula for the slope:

slope = sum((x - mean(x)) * (y - mean(y))) / sum((x - mean(x))**2)


This clear one-liner should be efficient enough without scipy:

slope = np.polyfit(X,Y,1)[0]

Finally you should get

import numpy as np

Y = np.array([
    [  2.62710000e+11, 3.14454000e+11, 3.63609000e+11, 4.03196000e+11, 4.21725000e+11, 2.86698000e+11, 3.32909000e+11, 4.01480000e+11, 4.21215000e+11, 4.81202000e+11],
    [  3.11612352e+03, 3.65968334e+03, 4.15442691e+03, 4.52470938e+03, 4.65011423e+03, 3.10707392e+03, 3.54692896e+03, 4.20656404e+03, 4.34233412e+03, 4.88462501e+03],
    [  2.21536396e+01, 2.59098311e+01, 2.97401268e+01, 3.04784552e+01, 3.13667639e+01, 2.76377113e+01, 3.27846013e+01, 3.73223417e+01, 3.51249997e+01, 4.42563658e+01]]).T
X = [ 1990,  1991,  1992,  1993,  1994,  1995,  1996,  1997,  1998,  1999] 

print(np.polyfit(X, Y, 1)[0])  # row 0 holds the slopes, one per column of Y

Output is [1.54983152e+10 9.98749876e+01 1.84564349e+00]