How to center the bars of the histogram over the elements of the input array?

How to center the bars of the histogram at the elements of the input array?
I mean, in the following example I would like the midpoint of each bar's base to coincide with the corresponding element of the array x. For instance, the leftmost bar should be centered at 1, not at 1.5.
x =[3 5 1 5 4 1 2 5 3 4];
histogram(x,'binwidth',1)
xlabel('x')
ylabel('frequency')
If I remove the binwidth option, the bars are centered and the histogram as a whole is correct:
x =[3 5 1 5 4 1 2 5 3 4];
histogram(x)
xlabel('x')
ylabel('frequency')

Accepted Answer

Steven Lord on 8 Nov 2023
Thanks a lot @Star Strider, but why do we get frequency = 5 for x = 4?
Shouldn't it be frequency = 2 for x = 4?
BinWidth and BinEdges
No. If you look at the description of the edges input on the histogram documentation page, the key point is:
"Each bin includes the leading edge, but does not include the trailing edge, except for the last bin which includes both edges."
This also appears (rewritten slightly) in the description of the BinEdges property of the histogram graphics object.
If I recall correctly, when you specify BinWidth MATLAB uses the colon operator and the minimum and maximum values in your data set. If the last element of the vector of bin edges created by colon falls short of the maximum value, it adds one last bin. Let's see what happens if we add a number just barely greater than 5 to your vector:
x =[3 5 1 5 4 1 2 5 3 4 5+1e-6];
bw = 1;
[counts1,edgs1,bin1] = histcounts(x, 'BinWidth', bw)
counts1 = 1×5
2 1 2 2 4
edgs1 = 1×6
1 2 3 4 5 6
bin1 = 1×11
3 5 1 5 4 1 2 5 3 4 5
The candidate BinEdges vector was:
v = min(x):bw:max(x)
v = 1×5
1 2 3 4 5
Does it "capture" all the elements in x?
v(end) >= max(x)
ans = logical
0
No, so we need one more bin. You can see that in the edgs1 vector above. Otherwise the last element of x wouldn't be in any bin, as you can see from bin2 below, and the last bin would contain both x = 4 and x = 5.
[counts2,edgs2,bin2] = histcounts(x, 'BinEdges', v)
counts2 = 1×4
2 1 2 5
edgs2 = 1×5
1 2 3 4 5
bin2 = 1×11
3 4 1 4 4 1 2 4 3 4 0
x(bin2 == 4)
ans = 1×5
5 5 4 5 4
In your case, the last element of the colon vector lands exactly on the maximum value of your data, so the last bin includes the elements of x in the closed interval [4, 5], unlike the other bins, which are half-open: [1, 2), [2, 3), etc. If the earlier bins were closed, we would double-count any elements that fall exactly on a bin edge; we don't have that problem with the last bin.
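The inclusion rule quoted above can be checked by hand. The following is a minimal sketch, not how histcounts is implemented internally: it counts the elements of x with half-open bins and a closed last bin, then compares the result with histcounts. The names edges, nb, and manual are introduced here for illustration only.

```matlab
x = [3 5 1 5 4 1 2 5 3 4];
edges = 1:5;                      % same edges as min(x):1:max(x)
nb = numel(edges) - 1;            % number of bins
manual = zeros(1, nb);
for k = 1:nb
    if k < nb
        % leading edge included, trailing edge excluded
        manual(k) = sum(x >= edges(k) & x < edges(k+1));
    else
        % the last bin includes both edges
        manual(k) = sum(x >= edges(k) & x <= edges(k+1));
    end
end
counts = histcounts(x, edges);
sameCounts = isequal(manual, counts)   % true if the manual rule matches histcounts
```

The last entry of manual combines the two 4s and the three 5s, which is where the frequency of 5 comes from.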
Another potential solution: BinMethod
But to muddy the waters a little more, if the elements of your array are integers I wouldn't specify BinWidth, BinEdges, or NumBins. I'd tell histogram or histcounts to use the 'integers' BinMethod. What does this method do? From the documentation page:
"The integer rule is useful with integer data, as it creates a bin for each integer. It uses a bin width of 1 and places bin edges halfway between integers.
To avoid accidentally creating too many bins, you can use this rule to create a limit of 65536 bins (2^16). If the data range is greater than 65536, then the integer rule uses wider bins instead."
x =[3 5 1 5 4 1 2 5 3 4];
[counts2,edgs2,bin2] = histcounts(x, 'BinMethod', 'integers')
counts2 = 1×5
2 1 2 2 3
edgs2 = 1×6
0.5000 1.5000 2.5000 3.5000 4.5000 5.5000
bin2 = 1×10
3 5 1 5 4 1 2 5 3 4
histogram(x, 'BinMethod', 'integers')
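As a quick check that these bins really are centered on the data values, the bin centers can be computed from the returned edges. This is only a sketch; the variable names edges and centers are chosen here for illustration.

```matlab
x = [3 5 1 5 4 1 2 5 3 4];
[counts, edges] = histcounts(x, 'BinMethod', 'integers');
% Midpoint of each bin: left edge plus half the bin width
centers = edges(1:end-1) + diff(edges)/2;
% With edges at 0.5, 1.5, ..., 5.5 the centers are the integers 1 through 5
centered = isequal(centers, unique(x))
```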

More Answers (1)

Dyuman Joshi on 8 Nov 2023
Edited: Dyuman Joshi on 8 Nov 2023
You can either specify the bin edges explicitly, offset by half a bin width so that each bin is centered on a data value:
x = [3 5 1 5 4 1 2 5 3 4];
w = 1;
bins = min(x)-w/2:w:max(x)+w/2;
histogram(x, bins)
xlabel('x')
ylabel('frequency')
Or change the x-ticklabels.
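The answer does not show the tick-label approach, so the following is only one possible reading of it, offered as a sketch. It assumes bins that start at each integer, places the ticks at the bin midpoints, and labels them with the left edge values. The names edges and centers are illustrative, and xticks and xticklabels require R2016b or later.

```matlab
x = [3 5 1 5 4 1 2 5 3 4];
w = 1;
edges = min(x):w:max(x)+w;                   % one bin per integer: [1,2), ..., [5,6]
histogram(x, edges)
centers = edges(1:end-1) + diff(edges)/2;    % midpoints 1.5, 2.5, ..., 5.5
xticks(centers)                              % put the ticks under the middle of each bar
xticklabels(string(edges(1:end-1)))          % label each bar with its integer value
xlabel('x')
ylabel('frequency')
```

Note that this changes only the labels: the bars still span [1, 2), [2, 3), and so on along the axis, so shifting the bin edges is the cleaner choice when the axis positions themselves matter.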
