histcounts2

Bivariate histogram bin counts

Syntax

[N,Xedges,Yedges] = histcounts2(X,Y)

[N,Xedges,Yedges] = histcounts2(X,Y,nbins)

[N,Xedges,Yedges] = histcounts2(X,Y,Xedges,Yedges)

[N,Xedges,Yedges] = histcounts2(___,Name,Value)

[N,Xedges,Yedges,binX,binY] = histcounts2(___)

Description

[N,Xedges,Yedges] = histcounts2(X,Y) partitions the values in X and Y into 2-D bins and returns the bin counts and the bin edges in each dimension. The histcounts2 function uses an automatic binning algorithm that returns uniform bins chosen to cover the range of values in X and Y and reveal the underlying shape of the distribution.

[N,Xedges,Yedges] = histcounts2(X,Y,nbins) specifies the number of bins to use in each dimension.

[N,Xedges,Yedges] = histcounts2(X,Y,Xedges,Yedges) partitions X and Y into bins with the bin edges specified by Xedges and Yedges.

[N,Xedges,Yedges] = histcounts2(___,Name,Value) specifies additional parameters using one or more name-value arguments for any of the previous syntaxes. For example, specify BinWidth as a two-element vector to adjust the width of the bins in each dimension.

[N,Xedges,Yedges,binX,binY] = histcounts2(___) also returns bin indices for the corresponding elements in X and Y.

Examples

Bin Counts and Bin Edges

Distribute 100 pairs of random numbers into bins. histcounts2 automatically chooses an appropriate bin width to reveal the underlying distribution of the data.

x = randn(100,1);
y = randn(100,1);
[N,Xedges,Yedges] = histcounts2(x,y)

N = 7×6

     0     0     0     2     0     0
     1     2    10     4     0     0
     1     4     9     9     5     0
     1     4    10    11     5     1
     1     4     6     3     1     1
     0     0     1     2     0     0
     0     0     1     0     1     0

Xedges = 1×8

    -3    -2    -1     0     1     2     3     4

Yedges = 1×7

    -3    -2    -1     0     1     2     3

Specify Number of Bins in Each Dimension

Distribute 10 pairs of numbers into 12 bins. Specify 3 bins in the x-dimension, and 4 bins in the y-dimension.

x = [1 1 2 3 2 2 1 1 2 3];
y = [5 6 3 8 9 1 2 7 5 1];
nbins = [3 4];
[N,Xedges,Yedges] = histcounts2(x,y,nbins)

N = 3×4

     1     0     2     1
     1     1     1     1
     1     0     0     1

Xedges = 1×4

    0.6000    1.4000    2.2000    3.0000

Yedges = 1×5

         0    2.3000    4.6000    6.9000    9.2000

Specify Bin Edges

Distribute 1,000 pairs of random numbers into bins. Define the bin edges with two vectors: one each for the x and y dimensions. The first element in each vector specifies the first edge of the first bin, and the last element is the last edge of the last bin.

x = randn(1000,1);
y = randn(1000,1);
Xedges = -5:5;
Yedges = [-5 -4 -2 -1 -0.5 0 0.5 1 2 4 5];
N = histcounts2(x,y,Xedges,Yedges)

N = 10×10

     0     0     0     0     0     0     0     0     0     0
     0     0     0     0     1     1     1     0     0     0
     0     0     5     5     3     5     1     2     0     0
     0     2    19    23    29    25    26    20     5     0
     0    10    36    51    59    71    54    46    10     0
     0     7    43    46    79    64    60    46     9     0
     0     3    12    18    21    23    19     9     6     0
     0     0     5     3     2     8     2     2     0     0
     0     0     0     1     1     1     0     0     0     0
     0     0     0     0     0     0     0     0     0     0

Normalized Bin Counts

Distribute 1,000 pairs of random numbers into bins. Specify Normalization as 'probability' to normalize the bin counts such that sum(N(:)) is 1. That is, each bin count represents the probability that an observation falls within that bin.

x = randn(1000,1);
y = randn(1000,1);
[N,Xedges,Yedges] = histcounts2(x,y,6,'Normalization','probability')

N = 6×6

         0         0    0.0020    0.0020         0         0
         0    0.0110    0.0320    0.0260    0.0070    0.0010
    0.0010    0.0260    0.1410    0.1750    0.0430    0.0060
         0    0.0360    0.1620    0.1940    0.0370    0.0040
         0    0.0040    0.0300    0.0370    0.0100    0.0010
         0    0.0030    0.0040    0.0040    0.0010         0

Xedges = 1×7

   -4.0000   -2.7000   -1.4000   -0.1000    1.2000    2.5000    3.8000

Yedges = 1×7

   -4.0000   -2.7000   -1.4000   -0.1000    1.2000    2.5000    3.8000

Determine Bin Placement

Distribute 1,000 random integer pairs between -10 and 10 into bins, and specify BinMethod as 'integers' to use unit-width bins centered on integers. Specify five outputs for histcounts2 to return vectors representing the bin placement of the data.

x = randi([-10,10],1000,1);
y = randi([-10,10],1000,1);
[N,Xedges,Yedges,binX,binY] = histcounts2(x,y,'BinMethod','integers');

Determine which bin the value (x(3),y(3)) falls into.

[x(3),y(3)]

ans = 1×2

    -8    10

bin = [binX(3) binY(3)]

bin = 1×2

     3    21

Input Arguments

`X,Y` — Data to distribute among bins (as separate arguments)
vectors | matrices | multidimensional arrays

Data to distribute among bins, specified as separate arguments of vectors, matrices, or multidimensional arrays. X and Y must have the same size.

Corresponding elements in X and Y specify the x and y coordinates of 2-D data points, [X(k),Y(k)]. The data types of X and Y can be different.

histcounts2 ignores all NaN values. Similarly, histcounts2 ignores Inf and -Inf values unless the bin edges explicitly specify Inf or -Inf as a bin edge.
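As a small illustration of these rules, the following sketch bins four pairs, one of which contains NaN and one of which contains Inf. The data and edge vectors are made up for this illustration.

```matlab
% Illustrative data: the third pair contains NaN, the fourth contains Inf.
x = [1 2 NaN Inf];
y = [1 2 3   4];

% Finite edges only: the NaN pair and the Inf pair are both left out.
N1 = histcounts2(x,y,[0 5],[0 5]);

% Adding Inf as the last x edge creates a bin [5,Inf] that holds the Inf pair.
N2 = histcounts2(x,y,[0 5 Inf],[0 5]);

disp(N1)   % single bin holding the two finite pairs
disp(N2)   % 2-by-1: finite pairs in the first x bin, Inf pair in the second
```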

`nbins` — Number of bins in each dimension
positive integer scalar | two-element vector of positive integers

Number of bins in each dimension, specified as a positive integer scalar or two-element vector of positive integers.

If nbins is a scalar, then histcounts2 uses that many bins in each dimension.
If nbins is a vector, then the first element gives the number of bins in the x-dimension, and the second element gives the number of bins in the y-dimension.

If you do not specify nbins, then histcounts2 automatically calculates how many bins to use based on the values in X and Y.

If you specify nbins with BinMethod or BinWidth, histcounts2 only honors the last parameter.

Example: [N,Xedges,Yedges] = histcounts2(X,Y,15) uses 15 bins in the x-dimension and in the y-dimension.

Example: [N,Xedges,Yedges] = histcounts2(X,Y,[15 20]) uses 15 bins in the x-dimension and 20 bins in the y-dimension.
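A minimal sketch of the scalar and vector forms, using made-up random data: the rows of N correspond to bins in the x-dimension and the columns to bins in the y-dimension, so the size of N mirrors nbins.

```matlab
rng(0)                           % fixed seed so the sketch is repeatable
X = randn(200,1);
Y = randn(200,1);

Ns = histcounts2(X,Y,15);        % scalar: 15 bins in both dimensions
Nv = histcounts2(X,Y,[15 20]);   % vector: 15 x-bins and 20 y-bins

size(Ns)
size(Nv)
```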

`Xedges` — Bin edges in x-dimension
vector

Bin edges in x-dimension, specified as a vector. The first element specifies the leading edge of the first bin in the x-dimension. The last element specifies the trailing edge of the last bin in the x-dimension. The trailing edge is only included for the last bin.

If you specify Xedges and Yedges with BinMethod, BinWidth, or NumBins, histcounts2 only honors the bin edges and the bin edges must be specified last.
If you specify Xedges with XBinLimits, histcounts2 only honors the Xedges and the Xedges must be specified last.

`Yedges` — Bin edges in y-dimension
vector

Bin edges in y-dimension, specified as a vector. The first element specifies the leading edge of the first bin in the y-dimension. The last element specifies the trailing edge of the last bin in the y-dimension. The trailing edge is only included for the last bin.

If you specify Yedges and Xedges with BinMethod, BinWidth, or NumBins, histcounts2 only honors the bin edges and the bin edges must be specified last.
If you specify Yedges with YBinLimits, histcounts2 only honors the Yedges and the Yedges must be specified last.

Name-Value Arguments

Specify optional pairs of arguments as Name1=Value1,...,NameN=ValueN, where Name is the argument name and Value is the corresponding value. Name-value arguments must appear after other arguments, but the order of the pairs does not matter.

Before R2021a, use commas to separate each name and value, and enclose Name in quotes.

Example: [N,Xedges,Yedges] = histcounts2(X,Y,'Normalization','probability') normalizes the bin counts in N, such that sum(N(:)) is 1.
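The two calling styles can be compared side by side. This sketch uses made-up random data and assumes R2021a or later for the Name=Value form.

```matlab
rng(0)
X = randn(500,1);
Y = randn(500,1);

% Name=Value syntax (R2021a and later).
N1 = histcounts2(X,Y,Normalization="probability");

% Comma-separated syntax with the name in quotes (works in earlier releases too).
N2 = histcounts2(X,Y,'Normalization','probability');

sameResult = isequal(N1,N2)
total = sum(N1(:))   % 1 up to floating-point rounding, since all data is binned
```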

`BinWidth` — Width of bins in each dimension
two-element vector of positive values

Width of bins in each dimension, specified as a two-element vector of positive values. The first element gives the width of the bins in the x-dimension, and the second element gives the width of the bins in the y-dimension.

If you specify BinWidth, then histcounts2 can use a maximum of 1024 bins (2¹⁰) along each dimension. If instead the specified bin width requires more bins, then histcounts2 uses a larger bin width corresponding to the maximum number of bins.

If you specify BinWidth with BinMethod or NumBins, histcounts2 only honors the last parameter.

Example: histcounts2(X,Y,'BinWidth',[5 10]) uses bins with size 5 in the x-dimension and size 10 in the y-dimension.
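The following sketch, with made-up data, checks the resulting bin widths and then requests a width that would need far more than 1024 bins per dimension. The second call relies on the cap described above, so the widths it reports are larger than the requested value.

```matlab
rng(0)
X = 100*rand(500,1);
Y = 100*rand(500,1);

[N,Xedges,Yedges] = histcounts2(X,Y,'BinWidth',[5 10]);
dx = unique(round(diff(Xedges),10))   % bin width along x
dy = unique(round(diff(Yedges),10))   % bin width along y

% Unit-width bins over a range of 1e6 would exceed the per-dimension cap.
[Nbig,XedgesBig] = histcounts2([0 1e6],[0 1e6],'BinWidth',[1 1]);
numXBins = numel(XedgesBig) - 1
```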

`XBinLimits` — Bin limits in x-dimension
two-element vector

Bin limits in x-dimension, specified as a two-element vector, [xbmin,xbmax]. The first element indicates the first bin edge in the x-dimension. The second element indicates the last bin edge in the x-dimension.

This option only bins data that falls within the bin limits inclusively, X>=xbmin & X<=xbmax.

`YBinLimits` — Bin limits in y-dimension
two-element vector

Bin limits in y-dimension, specified as a two-element vector, [ybmin,ybmax]. The first element indicates the first bin edge in the y-dimension. The second element indicates the last bin edge in the y-dimension.

This option only bins data that falls within the bin limits inclusively, Y>=ybmin & Y<=ybmax.
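A short sketch of both limit options, using made-up data in which two of the four x values lie outside the requested limits. The bin counts of [3 2] are chosen only to keep the output small.

```matlab
x = [-2 0 1 3];   % -2 and 3 lie outside the x limits below
y = [ 0 0 0 0];

[N,Xedges,Yedges] = histcounts2(x,y,[3 2], ...
    'XBinLimits',[-1 2],'YBinLimits',[-1 1]);

numBinned = sum(N(:))   % only the pairs inside both limits are counted
Xedges                  % first and last edges match XBinLimits
Yedges                  % first and last edges match YBinLimits
```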

`BinMethod` — Binning algorithm
`'auto'` (default) | `'scott'` | `'fd'` | `'integers'`

Binning algorithm, specified as one of the values in this table.

Value	Description
`'auto'`	The default `'auto'` algorithm chooses a bin width to cover the data range and reveal the shape of the underlying distribution.
`'scott'`	Scott’s rule is optimal if the data is close to being jointly normally distributed. This rule is appropriate for most other distributions, as well. It uses a bin size of `[3.5*std(X(:))*numel(X)^(-1/4), 3.5*std(Y(:))*numel(Y)^(-1/4)]`.
`'fd'`	The Freedman-Diaconis rule is less sensitive to outliers in the data, and might be more suitable for data with heavy-tailed distributions. It uses a bin size of `[2*iqr(X(:))*numel(X)^(-1/4), 2*iqr(Y(:))*numel(Y)^(-1/4)]`, or when `X` contains extreme outliers, `[0.2*(max(X(:))-min(X(:)))*numel(X)^(-1/4), 0.2*(max(Y(:))-min(Y(:)))*numel(Y)^(-1/4)]`.
`'integers'`	The integer rule is useful with integer data, as it creates bins centered on pairs of integers. It uses a bin width of 1 for each dimension and places bin edges halfway between integers. To avoid accidentally creating too many bins, you can use this rule to create a limit of 1024 bins (2¹⁰). If the data range for either dimension is greater than 1024, then the integer rule uses wider bins instead.

histcounts2 adjusts the number of bins slightly so that the bin edges fall on "nice" numbers, rather than using these exact formulas.
If you set the NumBins, XBinEdges, YBinEdges, BinWidth, XBinLimits, or YBinLimits properties, then BinMethod is set to 'manual'.
If you specify BinMethod with BinWidth or NumBins, histcounts2 only honors the last parameter.

Example: histcounts2(X,Y,'BinMethod','integers') centers the 2-D bins on each pair of integers.
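To see how the nominal formula relates to the bins that are actually returned, this sketch evaluates the Scott's rule expression from the table by hand and compares it with the widths that histcounts2 uses. The data is made up, and because of the adjustment to "nice" numbers the two rows are expected to be close rather than identical.

```matlab
rng(0)
X = randn(1000,1);
Y = 2*randn(1000,1);

% Nominal Scott's rule widths, evaluated from the expression in the table.
wScott = [3.5*std(X(:))*numel(X)^(-1/4), 3.5*std(Y(:))*numel(Y)^(-1/4)];

% Widths that histcounts2 actually uses with this method.
[~,Xedges,Yedges] = histcounts2(X,Y,'BinMethod','scott');
wUsed = [Xedges(2)-Xedges(1), Yedges(2)-Yedges(1)];

disp([wScott; wUsed])   % row 1: nominal widths, row 2: widths used
```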

`Normalization` — Type of normalization
`'count'` (default) | `'probability'` | `'percentage'` | `'countdensity'` | `'cumcount'` | `'pdf'` | `'cdf'`

Type of normalization, specified as one of the values in this table. For each bin i:

$v_{i}$ is the bin value.
$c_{i}$ is the number of elements in the bin.
$A_{i} = w_{x i} \cdot w_{y i}$ is the area of the bin, computed using the x and y bin widths.
$N$ is the number of elements in the input data. This value can be greater than the binned data if the data contains missing values or if some of the data lies outside the bin limits.

Value	Bin Values	Notes
`'count'` (default)	$v_{i} = c_{i}$	Count or frequency of observations. Sum of bin values is at most `numel(X)` and `numel(Y)`. The sum is less than this only when some of the input data is not included in the bins.
`'probability'`	$v_{i} = \frac{c_{i}}{N}$	Relative probability. The number of elements in each bin relative to the total number of elements in the input data is at most 1.
`'percentage'`	$v_{i} = 100 * \frac{c_{i}}{N}$	Relative percentage. The percentage of elements in each bin is at most 100.
`'countdensity'`	$v_{i} = \frac{c_{i}}{A_{i}}$	Count or frequency scaled by area of bin. The sum of the bin volumes (bin value times bin area) is at most `numel(X)` and `numel(Y)`.
`'cumcount'`	$v_{i} = \sum_{j = 1}^{i} c_{j}$	Cumulative count, or the number of observations in each bin and all previous bins in both the x and y dimensions. `N(end,end)` is at most `numel(X)` and `numel(Y)`.
`'pdf'`	$v_{i} = \frac{c_{i}}{N \cdot A_{i}}$	Probability density function estimate. The sum of the bin volumes is at most `1`.
`'cdf'`	$v_{i} = \sum_{j = 1}^{i} \frac{c_{j}}{N}$	Cumulative distribution function estimate. `N(end,end)` is at most 1.

Example: histcounts2(X,Y,'Normalization','pdf') bins the data using an estimate of the probability density function.
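The formulas in the table can be checked against the raw counts. This sketch uses made-up random data and nonuniform y edges so that the bin areas differ. The variable names (C, P, D, CC, A) are chosen for this illustration only.

```matlab
rng(0)
x = randn(1000,1);
y = randn(1000,1);
xe = -5:5;
ye = [-5 -4 -2 -1 -0.5 0 0.5 1 2 4 5];   % nonuniform edges

C  = histcounts2(x,y,xe,ye);                                 % raw counts
P  = histcounts2(x,y,xe,ye,'Normalization','probability');
D  = histcounts2(x,y,xe,ye,'Normalization','pdf');
CC = histcounts2(x,y,xe,ye,'Normalization','cumcount');

n = numel(x);                      % N in the formulas (all input elements)
A = diff(xe(:)) * diff(ye(:)).';   % bin areas, same size as C

% Largest deviation between each normalization and its formula.
errP  = max(abs(P(:) - C(:)/n));
errD  = max(abs(D(:) - C(:)./(n*A(:))));
cc2   = cumsum(cumsum(C,1),2);     % accumulate over x, then over y
errCC = max(abs(CC(:) - cc2(:)));
```

All three error values should be zero up to floating-point rounding.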

Output Arguments

`N` — Bin counts
array

Bin counts, returned as an array.

The binning scheme includes the leading x-dimension and y-dimension edge of each bin as well as the trailing edge for the last bins along the x-dimension and y-dimension.

Sample matrix showing the bin inclusion scheme as well as the relative orientation of the bins to the x-axis and y-axis

For example, the (1,1) bin includes values that fall on the first edge in each dimension, and the last bin in the bottom right includes values that fall on any of its edges.
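The inclusion scheme can be sketched with three made-up points that sit exactly on bin edges of a 2-by-2 grid.

```matlab
Xedges = [0 1 2];
Yedges = [0 1 2];

% Each point lies exactly on a bin edge in both dimensions.
x = [0 1 2];
y = [0 1 2];

[N,~,~,binX,binY] = histcounts2(x,y,Xedges,Yedges);

disp(N)             % rows are x bins, columns are y bins
disp([binX; binY])  % bin indices of each point
```

The point (0,0) lands in the (1,1) bin. The point (1,1) lies on the leading edges of the (2,2) bin, and the point (2,2) lies on the trailing edges of that same last bin, so both are counted there.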

`Xedges` — Bin edges in x-dimension
vector

Bin edges in x-dimension, returned as a vector. The first element is the leading edge of the first bin in the x-dimension. The last element is the trailing edge of the last bin in the x-dimension.

`Yedges` — Bin edges in y-dimension
vector

Bin edges in y-dimension, returned as a vector. The first element is the leading edge of the first bin in the y-dimension. The last element is the trailing edge of the last bin in the y-dimension.

`binX` — Bin index in x-dimension
array

Bin index in x-dimension, returned as an array of the same size as X. Corresponding elements in binX and binY describe which numbered bin contains the corresponding values in X and Y. A value of 0 in binX or binY indicates an element that does not belong to any of the bins (such as a NaN value).

For example, binX(1) and binY(1) describe the bin placement for the value [X(1),Y(1)].

`binY` — Bin index in y-dimension
array

Bin index in y-dimension, returned as an array of the same size as Y. Corresponding elements in binX and binY describe which numbered bin contains the corresponding values in X and Y. A value of 0 in binX or binY indicates an element that does not belong to any of the bins (such as a NaN value).

For example, binX(1) and binY(1) describe the bin placement for the value [X(1),Y(1)].
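One way to see how binX and binY relate to N is to rebuild the counts from the indices. This is a sketch with made-up data that uses accumarray for the tally. Elements with index 0 are skipped because they do not belong to any bin.

```matlab
rng(0)
x = randi([-10,10],1000,1);
y = randi([-10,10],1000,1);
[N,~,~,binX,binY] = histcounts2(x,y,'BinMethod','integers');

inBin  = binX > 0 & binY > 0;   % skip unbinned elements (index 0)
Ncheck = accumarray([binX(inBin) binY(inBin)], 1, size(N));

matches = isequal(N,Ncheck)
```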

Extended Capabilities

C/C++ Code Generation
Generate C and C++ code using MATLAB® Coder™.

Usage notes and limitations:

Code generation does not support sparse matrix inputs for this function.
If you do not supply bin edges, then code generation might require variable-size arrays and dynamic memory allocation.

Thread-Based Environment
Run code in the background using MATLAB® `backgroundPool` or accelerate code with Parallel Computing Toolbox™ `ThreadPool`.

This function fully supports thread-based environments. For more information, see Run MATLAB Functions in Thread-Based Environment.

Version History

Introduced in R2015b

R2023b: Normalize using percentages

You can normalize histogram values as percentages by specifying the Normalization name-value argument as 'percentage'.
