fit multiple gaussians to the data in python
@john1024's answer is good, but requires a manual process to generate the initial guess. here's an easy way to automate the starting guess. replace the relevant 3 lines of john1024's code by the following:
import scipy.signal
i_pk = scipy.signal.find_peaks_cwt(y, widths=range(3,len(x)//Npks))
DX = (np.max(x)-np.min(x))/float(Npks) # starting guess for component width
guess = np.ravel([[x[i], y[i], DX] for i in i_pk]) # starting guess for (x, amp, width) for each component
This requires a non-linear fit. A good tool for this is scipy's curve_fit
function.
To use curve_fit
, we need a model function, call it func
, that takes x
and our (guessed) parameters as arguments and returns the corresponding values for y
. As our model, we use a sum of gaussians:
from scipy.optimize import curve_fit
import numpy as np
def func(x, *params):
y = np.zeros_like(x)
for i in range(0, len(params), 3):
ctr = params[i]
amp = params[i+1]
wid = params[i+2]
y = y + amp * np.exp( -((x - ctr)/wid)**2)
return y
Now, let's create an initial guess for our parameters. This guess starts with peaks at x=0
and x=1,000
with amplitude 60,000 and e-folding widths of 80. Then, we add candidate peaks at x=60, 140, 220, ...
with amplitude 46,000 and width of 25:
guess = [0, 60000, 80, 1000, 60000, 80]
for i in range(12):
guess += [60+80*i, 46000, 25]
Now, we are ready to perform the fit:
popt, pcov = curve_fit(func, x, y, p0=guess)
fit = func(x, *popt)
To see how well we did, let's plot the actual y
values (solid black curve) and the fit
(dashed red curve) against x
:
As you can see, the fit is fairly good.
Complete working code
from scipy.optimize import curve_fit
import numpy as np
import matplotlib.pyplot as plt
data = np.loadtxt('data.txt', delimiter=',')
x, y = data
plt.plot(x,y)
plt.show()
def func(x, *params):
y = np.zeros_like(x)
for i in range(0, len(params), 3):
ctr = params[i]
amp = params[i+1]
wid = params[i+2]
y = y + amp * np.exp( -((x - ctr)/wid)**2)
return y
guess = [0, 60000, 80, 1000, 60000, 80]
for i in range(12):
guess += [60+80*i, 46000, 25]
popt, pcov = curve_fit(func, x, y, p0=guess)
print popt
fit = func(x, *popt)
plt.plot(x, y)
plt.plot(x, fit , 'r-')
plt.show()