Unexpected interquartile range (IQR) result

Sim
Sim on 9 Dec 2023
Commented: Sim on 11 Dec 2023
I would like to compare the interquartile range (IQR) and the standard deviation (STD) for a number of distributions.
For the normal distribution I got more or less what I expected: about 68% of the data lies within 1 STD of the mean, and the IQR covers about 50% of the data (i.e. the central half of the distribution). Here is my test:
clear all; clc;
samplesize = 100000;
% generate distribution
mu = 0;
sigma = 1;
data = normrnd(mu,repmat(sigma,samplesize,1));
% statistics
m = mean(data);
s = std(data);
data1sigma = data((data < (m+s)) & (data > (m-s)));
percentage_data_1sigma = length(data1sigma)/length(data)*100
percentage_data_1sigma = 68.1040
q = quantile(data,[0.25 0.5 0.75]);
dataIQR = data((data < (q(2)+q(1))) | (data > (q(2)-q(1))));
percentage_data_IQR = length(dataIQR)/length(data)*100
percentage_data_IQR = 50.1370
% plot
figure
hold on
h = histogram(data);
xline([m-s m m+s],'-k',{'-1 Standard Dev.','Mean','+1 Standard Dev.'},'linewidth',1)
xline([q(2)-q(1) q(2) q(2)+q(1)],'-r',{'Q1','Q2','Q3'},'linewidth',1)
set(h,'FaceAlpha',0.2)
hold off
However, if I try the same with another distribution, such as a gamma distribution, the IQR no longer covers 50% of the data. What did I do wrong?
clear all; clc;
samplesize = 100000;
% generate distribution
a = 1;
b = 5;
data = gamrnd(a,repmat(b,samplesize,1));
% statistics
m = mean(data);
s = std(data);
data1sigma = data((data < (m+s)) & (data > (m-s)));
percentage_data_1sigma = length(data1sigma)/length(data)*100
percentage_data_1sigma = 86.5350
q = quantile(data,[0.25 0.5 0.75]);
dataIQR = data((data < (q(2)+q(1))) | (data > (q(2)-q(1))));
percentage_data_IQR = length(dataIQR)/length(data)*100
percentage_data_IQR = 100
% plot
figure
hold on
h = histogram(data);
xline([m-s m m+s],'-k',{'-1 Standard Dev.','Mean','+1 Standard Dev.'},'linewidth',1)
xline([q(2)-q(1) q(2) q(2)+q(1)],'-r',{'Q1','Q2','Q3'},'linewidth',1)
set(h,'FaceAlpha',0.2)
hold off

Answers (1)

Sim
Sim on 9 Dec 2023
Edited: Sim on 9 Dec 2023
My mistake. The IQR is the interval between the first and third quartiles, q(1) and q(3), so the selection should be:
dataIQR = data( data > q(1) & data < q(3) );
The vertical lines for the quartiles also need to be drawn at the quartiles themselves:
xline([q(1) q(2) q(3)],'-r',{'Q1','Q2','Q3'},'linewidth',1)
Here is a corrected example:
% generate distribution
samplesize = 100000;
a = 1;
b = 8;
data = gamrnd(a,repmat(b,samplesize,1));
% statistics
m = mean(data);
s = std(data);
data1sigma = data((data < (m+s)) & (data > (m-s)));
percentage_data_1sigma = length(data1sigma)/length(data)*100
percentage_data_1sigma = 86.3970
q = quantile(data,[0.25 0.5 0.75]);
dataIQR = data( data > q(1) & data < q(3) );
percentage_data_IQR = length(dataIQR)/length(data)*100
percentage_data_IQR = 50
% plot
figure
hold on
h = histogram(data);
xline([m-s m m+s],'-k',{'-1 Standard Dev.','Mean','+1 Standard Dev.'},'linewidth',1)
xline([q(1) q(2) q(3)],'-r',{'Q1','Q2','Q3'},'linewidth',1)
set(h,'FaceAlpha',0.2)
hold off
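For anyone wondering why the original selection returned 100% for the gamma data, here is a minimal sketch comparing the two masks side by side. It rests on a few assumptions that are not in the code above: a fixed seed via rng(0), and data drawn as -b*log(rand(n,1)), which is an exponential sample and so a stand-in for gamrnd with shape a = 1 that does not need the toolbox. The names lo, hi, wrongMask and rightMask are only illustrative.

```matlab
% Sketch: compare the original (wrong) selection with the corrected one.
rng(0);                          % assumption: fixed seed so the run is repeatable
n = 100000;
b = 5;                           % scale used in the question
data = -b * log(rand(n,1));      % exponential sample, i.e. gamma with shape a = 1

q = quantile(data, [0.25 0.5 0.75]);

% Bounds used in the question: these are median -/+ Q1, not Q1 and Q3.
lo = q(2) - q(1);
hi = q(2) + q(1);

% Original test: an OR of two half-lines. Because lo < hi here, every
% value satisfies at least one condition, so everything is selected.
wrongMask = (data < hi) | (data > lo);

% Corrected test: values strictly between the first and third quartiles.
rightMask = (data > q(1)) & (data < q(3));

pctWrong = 100 * mean(wrongMask);
pctRight = 100 * mean(rightMask);
fprintf('original mask:  %.2f%%\n', pctWrong);
fprintf('corrected mask: %.2f%%\n', pctRight);
```

This also seems to explain why the normal case looked fine: there the median is close to 0 and Q1 is negative, so the two conditions pick out the lower and upper tails (about 25% each), which adds up to roughly 50% by coincidence rather than by selecting the central half.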
2 Comments
Steven Lord
Steven Lord on 9 Dec 2023
You could check your results using the iqr function and/or the prctile function, each moved from Statistics and Machine Learning Toolbox to MATLAB in release R2022a.
Sim
Sim il 11 Dic 2023
Thanks a lot @Steven Lord for your nice comment and suggestion! :-) :-)
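As a rough sketch of the check suggested above, the quartiles from quantile can be compared against prctile and iqr. The assumptions here are a fixed seed and an exponential sample generated as -b*log(rand(n,1)) as a toolbox-free stand-in for gamrnd(1,b,...); the variable names are only illustrative. Note that iqr returns the width Q3 - Q1 as a single number, not the two endpoints.

```matlab
% Sketch: cross-check the quartiles with prctile and iqr.
rng(0);                          % assumption: fixed seed so the run is repeatable
n = 100000;
b = 8;                           % scale used in the corrected example
data = -b * log(rand(n,1));      % exponential sample, i.e. gamma with shape a = 1

q = quantile(data, [0.25 0.75]); % quartiles as fractions
p = prctile(data, [25 75]);      % the same quartiles as percentages

widthFromQuantile = q(2) - q(1); % Q3 - Q1 computed by hand
widthFromIqr = iqr(data);        % width of the interquartile range

% For an exponential distribution with scale b the theoretical IQR is b*log(3).
widthTheory = b * log(3);

fprintf('Q3 - Q1 (quantile): %.4f\n', widthFromQuantile);
fprintf('iqr(data):          %.4f\n', widthFromIqr);
fprintf('theory b*log(3):    %.4f\n', widthTheory);
```

The first two numbers should agree to rounding error, and both should be close to the theoretical value of about 8.79 for this sample size.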
