Fitting Gaussian to a curve with multiple peaks

I have a curve with a few peaks (the figure is attached here). I want to fit a few Gaussian to it. How can I do that?

2 Commenti

There are pre-built models in the Curve Fitting Toolbox if you have it...
good question

Accedi per commentare.

 Risposta accettata

Attached is a demo for how to fit any specified number of Gaussians to noisy data. Here is an example where I created a signal from 6 component Gaussians by summing then, and then added noise to the summed curve. The input data is the dashed line (upper most curve), and the Gaussians it thought would sum to fit it best are shown in the solid color curves below the dashed curve.
Here is the main part of the program:
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% HEAVY LIFTING DONE RIGHT HERE:
% Run optimization
[parameter, fval, flag, output] = fminsearch(@(lambda)(fitgauss(lambda, tFit, y)), startingGuesses, options);
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
The rest is just setup/initialization and plotting code.

10 Commenti

edoardo amarotti
edoardo amarotti il 25 Nov 2020
Modificato: edoardo amarotti il 25 Nov 2020
Does this code run also if there's an exponential baseline??
This is what I would like to fit
thx
Yes, it should. Did you try it? I'd probably specify 4 Gaussians, or however many you think should be there theoretically.
There are lots of baseline correction algorithms if you want to use one before you fit it though.
because I run the code with 4 gaussian but they doesn't fit my data...
I attached my data, how may I have a good quality fit from your code?
Which baseline corretion algorithm do you suggested me?
thx in advance
Image Analyst
Image Analyst il 25 Nov 2020
Modificato: Image Analyst il 25 Nov 2020
Actually I think 6 was better. See attached script and adapt as needed.
Start your own thread/discussion/question if you have additional questions, and put the link to it back here.
Thor
Thor il 22 Feb 2021
Modificato: Thor il 22 Feb 2021
Hi @Image Analyst, I got a similar issue. I got some UV data and i want to fit a gaussian to the major peak. I have attempted to use your code, but the fitting is not as nice as the one your have shown above. Please , could you provide me with some guidance? I would be happy to open a question. Many thanks.
Thanks for your demo. I recently try to fit Gaussians to my data ( my data is a spectrum) , and I saw the question you answered (https://www.mathworks.com/matlabcentral/fileexchange/74408-fit-multiple-gaussians?tab=discussions#discussions_2483348 ).
You said that just replace
y = y + noiseAmplitude * (rand(size(y)) - 0.5);
to my data, and it will work.
But I only have x_data and y_data, there is not a function, could you provide me with some guidance ? I appreciate your help very much.
@YI-CHEN CHANG just get rid of that line and use y_data everywhere instead of y.
If you have any more questions, then attach your data and code to read it in with the paperclip icon after you read this:
Post your data and m-file in a new question, not here.
Ian-Andreas
Ian-Andreas il 19 Feb 2024
Modificato: Ian-Andreas il 19 Feb 2024
@Image Analyst I am doing something similar with absorption spectra and stumbled upon your solution which has greatly helped me. One problem which I have with fitting gaussians to it is that some of the gaussians havea negative amplitude, which I need to avoid. Do you know of a way to specify the code so that the sum of gaussians includes only the gaussians with positive amplitudes?
Hopefully my question is clear and hopefully you or someone else can help.
Thanks in advance!
@Ian-Andreas I'm not sure. What I would try is to change the number of Gaussians fitted until you get a situation where there no negative pointing Gaussians.
@Image Analyst Cheers, adding the number of gaussians appears to have solved it!

Accedi per commentare.

Più risposte (2)

Image Analyst
Image Analyst il 21 Lug 2018
Gaussian Mixture Models
Gaussian mixture models (GMM) are composed of k multivariate normal density components, where k is a positive integer. Each component has a d-dimensional mean (d is a positive integer), d-by-d covariance matrix, and a mixing proportion. Mixing proportion j determines the proportion of the population composed by component j, j = 1,...,k.
You can fit a GMM using the Statistics and Machine Learning Toolbox™ function fitgmdist by specifying k and by supplying X, an n-by-d matrix of data. The columns of X correspond to predictors, features, or attributes, and the rows correspond to observations or examples. By default, fitgmdist fits full covariance matrices that are different among components (or unshared).

13 Commenti

Thanks for the response. My problem is that I don't have data points which might come from multiple gaussian distribution. As it is shown in the figure I only have those curves.
That looks like a ML figure; if so retrieve the data from it...
hAx=gca; % handle to the axes
hL=hAx.Children; % retrieve the line handle (child of axes)
X=hL.XData; % then get the data of the line...
Y=hl.YData;
I know the value of the data. My problem is that they are not coming from a probability distributions. Meaning that the area under this curve is not one. I have one vector for X which is the angle and it is [-180 180]. And the other is the value of that curve which is in y axis. So how can I fit a few Gaussian curves to this curve?
I know MATLAB can take a signal and decompose it into some specified number of Gaussians and tell you their means and standard deviations, but honestly, I have not done it myself so I can't guide you through it. If you figure it out, post the code here.
Well, attach the .mat file containing the data, then...but what happens if you just use
pks=fit(X,Y,'gauss5');
? That's taking each bump as a potential peak...you might have better luck first fitting the two dominant and stripping before fitting the residuals; that's what we did in old SulfurMeter with gamma-spec using NaI which are similarly-looking low resolution peaks.
MINA
MINA il 23 Lug 2018
Modificato: MINA il 23 Lug 2018
Here is my matlab code. It does not give me good answers. Also attached is another figure along with its data.
N=10000;
Indices = crossvalind('Kfold', N, 5);
for S=1:6
modl=['gauss',num2str(S)];
for i=1:5
I_test=Indices==i;
[ fitobject_train{S,i},gof_train{S,i},output_train{S,i}] = fit(x2(~I_test'),y2(~I_test),modl);
y_test = feval(fitobject_train{S,i},x2(I_test));
RMSE(S,i) = sqrt(mean((y2(I_test) - y_test).^2));
end
end
rms=nanmean(RMSE,2);
[~,best_mdl]=min(rms);
dpb, historical note, that is also the strategy astronomers used in finding stars in the "Clean" algorithm. Take largest peak, fit it, subtract, and then iterate on all the smaller remaining peaks. I suppose it works okay empirically but I gotta think there's a fancier, mathematically accurate way.
MINA
MINA il 23 Lug 2018
Modificato: MINA il 23 Lug 2018
I have another code as well
AIC = zeros(1,10);
obj = cell(1,10);
for K = 1:10
obj{K} = gmdistribution.fit(y/sum(y),K);
AIC(K)= obj{K}.AIC;
end
[minAIC,numComponents] = min(AIC);
This one also doesn't give me any good answers...
@IA--Interesting! Thanks for sharing, enjoy those tidbits.
Our stripping was quite a bit more than just subtracting the raw fit although that was the base starting point; we had MC simulations of the detector system response curves that were used to control the fitting process. With those there were also libraries of measurements of known coal types and S levels that were used to help in the calibration problem.
dpb
dpb il 23 Lug 2018
Modificato: dpb il 23 Lug 2018
@MINA -- I'll try to play some this evening but I'm not at all surprised just plain fitting isn't all that good...there's not much definition in the first for more than two at most and even the second is badly overlaid by another or other contamination.
Unless you have some additional knowledge similar to the ideas of the S-M of knowing something about what the answer has to be and the measurement system characteristics, I suspect you'll not have much luck.
What are the data measurements of; knowing that might give some klews...
PS. Didn't see the first data set attached??? The second is near hopeless without setting some fixed parameters probably; the chances of OLS managing to separate the one middle blob into two is nil.
W/ the S measurement, the energies of the peaks of interest are known so there's a pretty good head start in what the answer has to be.
This is not a physical measurements. I had fitted a model through GLM to my data and these are the kernels that GLM gives me. Now I want to find the peaks of these kernels.
Hi,MINA, refer to the results below of multi-peak fit, looks perfect:
eq.png
Root of Mean Square Error (RMSE): 0.000758421176625326
Sum of Squared Residual: 0.00575202681153745
Correlation Coef. (R): 0.99999631975421
R-Square: 0.999992639521965
Adjusted R-Square: 0.999992638049428
Determination Coef. (DC): 0.999992638461578
Chi-Square: 0.0026447203781342
F-Statistic: 58916561.8092515
Parameter Final Estimate
---------- --------------
y01 3.05300628204444
a1 -16.9598254082349
w1 -33.1216274304912
xc1 -199.237682046936
y02 -5.01995553438715
a2 125.991592415662
w2 68.1014557034446
xc2 -38.1447385704029
y03 -21670.4124112499
a3 -862.748820039932
w3 -150.593954759166
xc3 -158.633069857899
y04 2.38145165713579
a4 71.7219094008726
w4 56.7351551616516
xc4 21.6923380713498
y05 -23.3097952825874
a5 -148.854281788758
w5 -73.9490093077971
xc5 220.960836388443
y06 21689.0835365582
a6 -1739.80097056373
w6 -275.612137225476
xc6 104.20899470954
c235.jpg
Comment posted as flag by qi lu:
Hi, this demo seems perfect. Could you please tell me how to fit such equation. I use 'fit' in the curve fitting box, the fitting result is horrible.

Accedi per commentare.

Image Analyst
Image Analyst il 24 Lug 2018
Modificato: Image Analyst il 25 Nov 2020
One way to do it is to set up the sum of two Gaussians with an offset and a linear ramp. Then you can use fitnlm, with your best guesses as to the parameters. See code:
% Uses fitnlm() to fit a non-linear model (sum of two gaussians on a ramp) through noisy data.
% Requires the Statistics and Machine Learning Toolbox, which is where fitnlm() is contained.
% Initialization steps.
clc; % Clear the command window.
close all; % Close all figures (except those of imtool.)
clear; % Erase all existing variables. Or clearvars if you want.
workspace; % Make sure the workspace panel is showing.
format long g;
format compact;
fontSize = 20;
% Create the X coordinates from 0 to 20 every 0.5 units.
X = linspace(0, 80, 400);
mu1 = 10; % Mean, center of Gaussian.
sigma1 = 3; % Standard deviation.
mu2 = 50; % Mean, center of Gaussian.
sigma2 = 9; % Standard deviation.
% Define function that the X values obey.
a = 6 % Arbitrary sample values I picked.
b = 35
c = 25
m = 0.1
Y = a + m * X + b * exp(-(X - mu1) .^ 2 / sigma1)+ c * exp(-(X - mu2) .^ 2 / sigma2); % Get a vector. No noise in this Y yet.
% Add noise to Y.
Y = Y + 1.2 * randn(1, length(Y));
% Now we have noisy training data that we can send to fitnlm().
% Plot the noisy initial data.
plot(X, Y, 'b.', 'LineWidth', 2, 'MarkerSize', 15);
grid on;
% Convert X and Y into a table, which is the form fitnlm() likes the input data to be in.
tbl = table(X', Y');
% Define the model as Y = a + exp(-b*x)
% Note how this "x" of modelfun is related to big X and big Y.
% x((:, 1) is actually X and x(:, 2) is actually Y - the first and second columns of the table.
modelfun = @(b,x) b(1) + b(2) * x + b(3) * exp(-(x(:, 1) - b(4)).^2/b(5)) + b(6) * exp(-(x(:, 1) - b(7)).^2/b(8));
beta0 = [6, 0.1, 35, 10, 3, 25, 50, 9]; % Guess values to start with. Just make your best guess.
% Now the next line is where the actual model computation is done.
mdl = fitnlm(tbl, modelfun, beta0);
% Now the model creation is done and the coefficients have been determined.
% YAY!!!!
% Extract the coefficient values from the the model object.
% The actual coefficients are in the "Estimate" column of the "Coefficients" table that's part of the mode.
coefficients = mdl.Coefficients{:, 'Estimate'}
% Let's do a fit, but let's get more points on the fit, beyond just the widely spaced training points,
% so that we'll get a much smoother curve.
X = linspace(min(X), max(X), 1920); % Let's use 1920 points, which will fit across an HDTV screen about one sample per pixel.
% Create smoothed/regressed data using the model:
yFitted = coefficients(1) + coefficients(2) * X + coefficients(3) * exp(-(X - coefficients(4)).^2 / coefficients(5)) + coefficients(6) * exp(-(X - coefficients(7)).^2 / coefficients(8));
% Now we're done and we can plot the smooth model as a red line going through the noisy blue markers.
hold on;
plot(X, yFitted, 'r-', 'LineWidth', 2);
grid on;
title('Exponential Regression with fitnlm()', 'FontSize', fontSize);
xlabel('X', 'FontSize', fontSize);
ylabel('Y', 'FontSize', fontSize);
legendHandle = legend('Noisy Y', 'Fitted Y', 'Location', 'northeast');
legendHandle.FontSize = 25;
% Set up figure properties:
% Enlarge figure to full screen.
set(gcf, 'Units', 'Normalized', 'OuterPosition', [0 0 1 1]);
% Get rid of tool bar and pulldown menus that are along top of figure.
% set(gcf, 'Toolbar', 'none', 'Menu', 'none');
% Give a name to the title bar.
set(gcf, 'Name', 'Demo by ImageAnalyst', 'NumberTitle', 'Off')

6 Commenti

I was very pleased at how this turned out! Make obvious modifications for 3 or 4 Gaussians instead of 2.
What if you try occluding the two, though...can it manage to sort them out? :)
No, if the humps get two close together it doesn't work well. I'd be interested to hear how it worked with MINA's data, which is not attached (yet).
What if to adapt this code for 2D image? I am trying to fit multiple derivative of Gaussians with x direction in a image. Below is an example that I would like to fit for one Gaussian (derivative in x direction). I attached my data, how may I have a good fit from your code?
Thanks in advance.
I haven't done it in 2-D. You'd have to experiment around with it and try to adapt it to 2-D. You can do that as well as I can.
@Nan Tao, if your real goal is get the partial derivate from data and the data are quite smooth
(no noise), then simple numerical differentiation (finite-difference) shold work fine.
If you really want to fit a model form your data then compute the partial derivate of the fitted model, you
may try this : nonlinear surface fitting, here's the result I got,
the fitted model match the data you provide quite well.
mean square error : 6.6370e-13
mean absolute error: 5.4212e-07
partial derivate estimation
Code: (fitting)
clear; clc; close all;
load('I0_Gaus_Gradient_Fit_0226.mat');
Z=I0_Gaus_Fit_0226;
% to run this script, download the toolbox at file exchage: https://tinyurl.com/wre9r5uk
count=0;
for i=1:size(Z,1)
for j=1:size(Z,2)
count=count+1;
X(1,count)=i;
X(2,count)=j;
Y(count)=Z(i,j);
end
end
%% set up model
NN.InputAutoScaling='on';
NN.LabelAutoScaling='on';
InSize=2; OutSize=1;
LayerStruct=[InSize,5,5,5,OutSize];
NN=Initialization(LayerStruct,NN);
%% set up optimizer
option.MaxIteration=500;
NN=OptimizationSolver(X,Y,NN,option);
Compute partial derivate & visulization
%% visualization
R=FittingReport(X,Y,NN);
P=NN.Evaluate(X);
PMatrix=reshape(P,size(Z,1),size(Z,2));
s=surf(PMatrix);
s.EdgeColor='none';
title(' fitted surface, $$ z=f(x,y) $$,','interpreter','latex')
%% partial derivate estimation
D=NN.Derivate(X);
dx=reshape(D(1,:),size(Z));
dy=reshape(D(2,:),size(Z));
figure
sdx=surf(dx); title('$$ \frac{\partial z}{\partial x} $$','interpreter','latex'); sdx.EdgeColor='none';
figure
sdy=surf(dy); title('$$ \frac{\partial z}{\partial y}$$','interpreter','latex'); sdy.EdgeColor='none';

Accedi per commentare.

Richiesto:

il 21 Lug 2018

Commentato:

il 21 Apr 2025

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!

Translated by