Is it possible to create a histogram with fractional entries for each bin?
Thank you for looking at my question! I have included a brief introduction below; any suggestions or comments would be greatly appreciated!
Traditional histograms are generated from an array (e.g. sample_array = [1,1,1,2,2,3,3,3,3,4]) using h = histogram(sample_array,nbins);. In this example, with nbins = 4, I would get a simple histogram in which the height of each column is the number of times the corresponding value is observed in the sample array.
However, in my work I have come upon the need for some entries to be arrays of values rather than a single value. For example:
sample_array = [1,1,[1,2],2,2,3,[2,3,4,5],3,4];
I am aware that this does not work as written, since MATLAB concatenates the nested brackets into one flat array. For convenience I am instead using a cell array to hold the data:
sample_cell = {1,1,[1,2],2,2,3,[2,3,4,5],3,4};
What I need to do is generate the resulting histogram of sample_cell where EACH ENTRY of the cell carries EQUAL WEIGHT, so an entry holding several values shares its weight of 1 equally among them. The corresponding weights would be as follows:
sample_weight = {1,1,[1/2,1/2],1,1,1,[1/4,1/4,1/4,1/4],1,1};
From this, the resulting histogram would have the following counts in the bins for 1 through 4 (the value 5 would additionally receive 0.25):
Bin: Count
1: 2.5
2: 2.75
3: 2.25
4: 1.25
I am looking for a way to generate this resulting histogram which does not include using the least common multiple of the sizes of each entry. (I have a temporary solution to the problem including this quantity, however, I am unable to scale it up properly as I am dealing with very large prime numbers which result in LCM > 10^9.)
Again, any help or suggestions that you might have would be greatly appreciated!
Accepted Answer
David Young
on 6 Aug 2015
Edited: David Young
on 6 Aug 2015
If all the samples are positive integers, and the bins are all centred on the positive integers and with unit width, as in the initial example, you can just do this:
% data
sample_cell = {1,1,[1,2],2,2,3,[2,3,4,5],3,4};
samples = cat(2, sample_cell{:});
weight_cell = cellfun(@(a) ones(size(a))/length(a), sample_cell, ...
'UniformOutput', false);
weights = cat(2, weight_cell{:});
counts = accumarray(samples(:), weights(:)).';
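As a quick sanity check on the snippet above, here is a minimal self-contained sketch on the same data. It assumes R2015a or later, because it builds the weights with repelem instead of cellfun plus cat (that substitution is my own shortcut, not part of the original answer). It also assumes every cell holds a row vector.

```matlab
% Self-contained check of the integer-bin case (same data as above).
sample_cell = {1,1,[1,2],2,2,3,[2,3,4,5],3,4};
samples = cat(2, sample_cell{:});

% Each cell carries a total weight of 1, split equally among its elements.
n = cellfun(@numel, sample_cell);   % number of values in each cell
weights = repelem(1 ./ n, n);       % assumption: repelem is available (R2015a+)

counts = accumarray(samples(:), weights(:)).';
% counts is [2.5 2.75 2.25 1.25 0.25]; the last entry comes from the
% value 5 in the cell [2,3,4,5].

% The weighted counts should add up to the number of cells.
assert(abs(sum(counts) - numel(sample_cell)) < 1e-12);

bar(1:numel(counts), counts, 1);    % unit-width bars centred on the integers
```

The first four entries match the table in the question, and the total is 9, one unit of weight per cell.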
If this isn't the case (as in your more accurate example in the comments), you have to modify the code above by putting the samples into bins before weighting and counting them. This then looks like this:
% data and histogram parameters
sample_cell = {[0,0.41],0.32,[0.13,0.67,0.2],0.9,[0.3,1,0.89]};
edges = 0:0.1:1;
% put all the samples into one vector, and make a vector of their weights
samples = cat(2, sample_cell{:});
weight_cell = cellfun(@(a) ones(size(a))/length(a), sample_cell, ...
'UniformOutput', false);
weights = cat(2, weight_cell{:});
% work out which bin of the histogram each sample falls into
bins = discretize(samples, edges);
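% Now form the counts, applying the weights for each sample; the size
% argument keeps trailing empty bins so the result matches the bin centres
wtdcounts = accumarray(bins(:), weights(:), [numel(edges)-1, 1]).';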
% and normalise to probabilities
normcounts = wtdcounts/sum(wtdcounts); % normalise to sum to 1
% plot like histogram
centres = conv(edges, [0.5 0.5], 'valid');
bar(centres, normcounts, 1);
This gives the same results as the code in your comment, but I think it will be a great deal more economical.
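One thing to watch with the binned version: discretize returns NaN for samples outside the edges, and accumarray will not accept NaN subscripts. The following is only a sketch of one way to guard against that, under the assumption that out-of-range samples should simply be dropped along with their share of the weight. The value 1.7 in the data is made up for illustration, and repelem (R2015a+) is used as a shortcut for building the weights.

```matlab
% Sketch: same weighting scheme, but tolerant of samples outside the edges.
% The value 1.7 is a made-up out-of-range sample for illustration.
sample_cell = {[0,0.41],0.32,[0.13,0.67,0.2],0.9,[0.3,1.7,0.89]};
edges = 0:0.1:1;
nbins = numel(edges) - 1;

% Flatten the samples and give each cell a total weight of 1.
samples = cat(2, sample_cell{:});
n = cellfun(@numel, sample_cell);
weights = repelem(1 ./ n, n);

% Samples outside [edges(1), edges(end)] get a bin index of NaN.
bins = discretize(samples, edges);
inrange = ~isnan(bins);

% Assumed policy: drop out-of-range samples. The size argument fixes the
% output length so that empty trailing bins are kept.
wtdcounts = accumarray(bins(inrange).', weights(inrange).', [nbins 1]).';

% Plot the weighted counts at the bin centres.
centres = conv(edges, [0.5 0.5], 'valid');
bar(centres, wtdcounts, 1);
```

With this policy the dropped sample takes its 1/3 weight with it, so the counts sum to 5 - 1/3 rather than 5. If you would rather share a cell's weight only among its in-range values, the weights would have to be computed after the range check instead.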
3 Comments
David Young
on 6 Aug 2015
I've modified my answer to deal with the more general case. The second piece of code in the answer gives the same results as your lcm code above on the test data.