Create a Rayleigh distribution fitting experimental data

Question

DANIELE PASINI il 16 Gen 2022

0
Link

Link diretto a questa domanda

https://it.mathworks.com/matlabcentral/answers/1629970-create-a-rayleigh-distribution-fitting-experimental-data

Risposto: William Rose il 18 Gen 2022

Hi, i'm trying to fit a rayleigh distribution to experimental data, but even if I've found the optimal parameter B for the distribution, it results in a completely different one.

I've tried using histfit (which works but I can't use in my assignment), makedist and the distributionFitter app.

Here's the code i'm using:

peaks=findpeaks(Data(:,i),'MinPeakHeight',0.01);

peaks=sort(peaks);

B=raylfit(peaks); %find parameter for Rayleigh distribution

ray=makedist('Rayleigh','B',B);

raydist=pdf(ray,peaks);

peaksdist=histcounts(peaks,1000,'Normalization','pdf'); %experimental distribution

hold on

x=linspace(0,max(peaks),length(raydist));

plot(x,raydist,'LineWidth',2)

When using the distributionFitter app I get this result instead:

But the parameter B used is the same in both (0.031).

How can I obtain the same distribution using the makedist function?

2 Commenti
Mostra NessunoNascondi Nessuno

William Rose il 16 Gen 2022

@DANIELE PASINI, can you provide the data please?

DANIELE PASINI il 16 Gen 2022

Cartel1.xlsx

In the attachment is the first column of the data, the same I'm using to try the script

Accedi per commentare.

Accedi per rispondere a questa domanda.

Answer 1

William Rose il 17 Gen 2022

0
Link

Link diretto a questa risposta

https://it.mathworks.com/matlabcentral/answers/1629970-create-a-rayleigh-distribution-fitting-experimental-data#answer_875665

Apri in MATLAB Online

Cartel1.mat

@DANIELE PASINI,

I plotted your data.

name='Cartel1';

data=load(name);

Data=data.data; %extract array 'data' from the structure 'data'

N=length(Data);

figure; subplot(311), plot(1:N,Data); xlabel('Sample'); title(name);

subplot(312), plot(1:5000,Data(1:5000)); xlabel('Sample');

subplot(313), plot(1:500,Data(1:500)); xlabel('Sample');

Now I understand why you want to model the distribution of peak heights. The data is a recording which shows bursts of activity. It is approximately symmetric about zero, like an electromyogram. Perhaps the peak height distribution will be useful for comparing recordings from different experiments.

Therefere let us find all the positive and negative peaks (i.e. peaks of -Data). Let us plot their histograms separately.

%find positive and negative peaks

peakpos=findpeaks(Data,'MinPeakHeight',0.01);

peakneg=findpeaks(-Data,'MinPeakHeight',0.01);

fprintf('Found %d positive peaks and %d negative peaks. \n',length(peakpos),length(peakneg));

Found 3383 positive peaks and 3335 negative peaks.

%plot superimposed histograms of the positive and negative peak heights

figure;

h1p=histogram(peakpos,'FaceColor','g'); hold on

h1n=histogram(peakneg,'FaceColor','r');

legend('Positive peaks','Negative Peaks'); title('Histogram');

Since the histograms are very similar, we combine the sets into a single set of peak heights, :

%The histograms show that the positive and negative peaks appear to have
%the same distribution, so we combine them into a single set.
peaks=[peakpos;peakneg]; Npk=length(peaks);

Now fit the peak heights with a Rayleigh distribution.

bhat=raylfit(peaks);          %sqrt of max.likelihood estimate of b^2
fprintf('Best fit Rayleigh parameter: bhat=%.4f.\n',bhat);
Best fit Rayleigh parameter: bhat=0.0310.

Plot the histogram of peak heights and the corresponding values expected from the best fit.

figure;

h2=histogram(peaks);

BinCenters=(h2.BinEdges(1:end-1)+h2.BinEdges(2:end))/2;

numExp=Npk*h2.BinWidth*raylpdf(BinCenters,bhat); %expected number at each bin

hold on; grid on

ylabel('Number of Occurences');

titlestr=sprintf('$Histogram, N=%d, \\hat{b}=%.4f$',N,bhat);

title(titlestr,'Interpreter','latex');

plot(BinCenters,numExp,'-r.'); hold off

Interesting. The peak height distribution is clearly not from a Rayleigh distribution.

The theoretical ratio of mean to standard deviation, for a random variable with a Rayleigh distribution, is

.

The ratio is 1.109 for peaks, which is not close to the theoretical value, if peaks has a Rayleigh distribution. This is more evidence for the non-Rayleighness of peaks.

2 Commenti
Mostra NessunoNascondi Nessuno

William Rose il 17 Gen 2022

RayleighFitTest.m

@DANIELE PASINI, The Weibull distribution, the Gamma distribution, and the generalized Pareto distribution fit smewhat better than Rayeigh. However, the plots below show that none of these distributions are correct for this data. The code which makes the plot below is attached. It loads the file Cartel1.mat which was attached to my answer above.

DANIELE PASINI il 17 Gen 2022

I'm not completely sure how or why this works, but it does, thank you very much!

I know that the two distributions don't work for my data, but that's what I need to do, try to fit both and comment on them.

If I can ask, could you explain why you used

numExp=Npk*h2.BinWidth*raylpdf(BinCenters,bhat);

to get the distribution and why my previous code didn't work (the one using makedist('Rayleigh','B',B); raydist=pdf(ray,peaks);)?

But either way thank you very much!

Accedi per commentare.

Answer 2

the cyclist il 16 Gen 2022

0
Link

Link diretto a questa risposta

https://it.mathworks.com/matlabcentral/answers/1629970-create-a-rayleigh-distribution-fitting-experimental-data#answer_875470

Apri in MATLAB Online

According to the documentation for raylfit, it takes the raw data as input. Is there a reason you are not simply using

B=raylfit(Data(:,i));

to find the fit parameter?

5 Commenti
Mostra 3 commenti meno recentiNascondi 3 commenti meno recenti

the cyclist il 16 Gen 2022

Apri in MATLAB Online

Here are some things I have observed in your data:

There are 65,536 data points in the original set
32,722 of those -- HALF! -- are positive, so these are not just tiny anomalies
After using peaks, there are only 3,383 points

I am pretty certain that you are throwing away a lot of data that you do not intend to. I don't think peaks does what you think it does. (It does not just trim away the very small and negative values.)

What @Image Analyst sugggests might be what you want, but I have no idea why you think you should be fitting a Rayleigh distribution to data in which half the data are negative.

DataTable = readtable('https://www.mathworks.com/matlabcentral/answers/uploaded_files/864220/Cartel1.xlsx');

histogram(DataTable{:,1},75)

DANIELE PASINI il 16 Gen 2022

I need to make an analysis of the experimental distribution of the peaks value. Since later on I will need to also fit a Weibull distribution I've used findpeaks to find the peaks that are positive only and also eliminate very small values (maybe I'm wrong, but I will see further on the project). So I intended to eliminate those values. What I'm asking is basically why the two graphs I've posted are different even though I'm using the same parameter for the distribution? The same happens when I tried with the Weibull distribution.

Accedi per commentare.

Answer 3

William Rose il 16 Gen 2022

0
Link

Link diretto a questa risposta

https://it.mathworks.com/matlabcentral/answers/1629970-create-a-rayleigh-distribution-fitting-experimental-data#answer_875605

Modificato: William Rose il 16 Gen 2022

Apri in MATLAB Online

@DANIELE PASINI, here is code with simulated data to demonstrate. As others have said, your data may not be appropriate for Rayleigh (whih is square root of the sum of two independent normal random variables). Your samples should be in vector x. Compute N=number of data points before computing numExp.

I did not use histfit, since you said it was not allowed. I did use raylfit(). If raylfit() is not allowed, then compute the maximum likelihood estimate for b^2, and take the square root:

bsqdhat=dot(x,x)/(2*N); . %MLE for b^2

The square toot of bsqdhat is not necessarily a MLE for b, and it may not be unbiased, but this seems to be what raylfit() does, because the result is the same to 4 decimal places, whenever I compare the estimates.

%RayleighFitTest.m

%WCR 20220116

%Generate simulated data to fit.

%Replace this section with your data.

b=0.031;

N=1000; %number of samples

x=b*sqrt(-2*log(rand(1,N))); %x=vector of samples whose distribution we wish to esitmate

%fit the data

bhat=raylfit(x); %Matlab's fit

bsqdhat=dot(x,x)/(2*N); %MLE for b^2:

fprintf('Best fit Rayleigh parameter: bhat=%.4f, sqrt(b^2hat)=%.4f\n',bhat,sqrt(bsqdhat));

Best fit Rayleigh parameter: bhat=0.0308, sqrt(b^2hat)=0.0308

%plot the histogram of the data, and the number expected from the fit

h=histogram(x);

BinCenters=(h.BinEdges(1:end-1)+h.BinEdges(2:end))/2;

numExp=N*h.BinWidth*raylpdf(BinCenters,b); %expected number at each bin

hold on; grid on

ylabel('Number of Occurences');

titlestr=sprintf('$Histogram, N=%d, \\hat{b}=%.4f$',N,bhat);

title(titlestr,'Interpreter','latex');

plot(BinCenters,numExp,'-bx');

Good luck!

0 Commenti
Mostra -2 commenti meno recentiNascondi -2 commenti meno recenti

Accedi per commentare.

Answer 4

William Rose il 18 Gen 2022

0
Link

Link diretto a questa risposta

https://it.mathworks.com/matlabcentral/answers/1629970-create-a-rayleigh-distribution-fitting-experimental-data#answer_876350

Modificato: William Rose il 18 Gen 2022

@DANIELE PASINI,

[I edited this comment to correct typographical errors.]

histogram(peaks) plots the number of occurences of values within the range of x-values for that bin. The expected number of counts for that bin is the integral of the probability density function over the bin, times the total number of samples.

Since each bin is reasonably narrow, we can approximate the integral with the midpoint value times the width:

where p(x)=probability density at x, and

, and

= bin width. Therefore

.

The command

raydist=pdf(ray,peaks);

does not give the correct result because it does not multiply by the number of samples, and does not multiply by the bin width, and because the argument peaks is not the correct argument to use. Instead of peaks, you should pass the x-values at which you want to evaluate the probability density function. And those desired x values are the bin center points. You can see and understand from my code how I computed the bin center points.

I hope your work goes well. What is the source of your signal? Seismogram? Biological recording?

1 Commento
Mostra -1 commenti meno recentiNascondi -1 commenti meno recenti

DANIELE PASINI il 18 Gen 2022

Thank you very much, now it is much clearer!

It's a signal of accelerometers put on a bridge

Accedi per commentare.

Answer 5

William Rose il 18 Gen 2022

0
Link

Link diretto a questa risposta

https://it.mathworks.com/matlabcentral/answers/1629970-create-a-rayleigh-distribution-fitting-experimental-data#answer_876745

@DANIELE PASINI, very interesting. The autocorrelation shows some ringing at a period of about 16 samples, i.e. some kind of resonance at that frequency.

0 Commenti
Mostra -2 commenti meno recentiNascondi -2 commenti meno recenti

Accedi per commentare.

Create a Rayleigh distribution fitting experimental data

2 Commenti
Mostra NessunoNascondi Nessuno

Risposta accettata

2 Commenti
Mostra NessunoNascondi Nessuno

Più risposte (4)

5 Commenti
Mostra 3 commenti meno recentiNascondi 3 commenti meno recenti

0 Commenti
Mostra -2 commenti meno recentiNascondi -2 commenti meno recenti

1 Commento
Mostra -1 commenti meno recentiNascondi -1 commenti meno recenti

0 Commenti
Mostra -2 commenti meno recentiNascondi -2 commenti meno recenti

Vedere anche

Categorie

Tag

Community Treasure Hunt

Create a Rayleigh distribution fitting experimental data

2 Commenti Mostra NessunoNascondi Nessuno

Risposta accettata

2 Commenti Mostra NessunoNascondi Nessuno

Più risposte (4)

5 Commenti Mostra 3 commenti meno recentiNascondi 3 commenti meno recenti

0 Commenti Mostra -2 commenti meno recentiNascondi -2 commenti meno recenti

1 Commento Mostra -1 commenti meno recentiNascondi -1 commenti meno recenti

0 Commenti Mostra -2 commenti meno recentiNascondi -2 commenti meno recenti

Vedere anche

Categorie

Tag

Community Treasure Hunt

2 Commenti
Mostra NessunoNascondi Nessuno

2 Commenti
Mostra NessunoNascondi Nessuno

5 Commenti
Mostra 3 commenti meno recentiNascondi 3 commenti meno recenti

0 Commenti
Mostra -2 commenti meno recentiNascondi -2 commenti meno recenti

1 Commento
Mostra -1 commenti meno recentiNascondi -1 commenti meno recenti

0 Commenti
Mostra -2 commenti meno recentiNascondi -2 commenti meno recenti