Binomial Distribution

Overview

The binomial distribution models the total number of successes in repeated trials from an infinite population under the following conditions:

• Only two outcomes are possible on each of n trials.

• The probability of success for each trial is constant.

• All trials are independent of each other.
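These conditions can be checked informally by simulation. The following is a minimal sketch, assuming illustrative values n = 20 and p = 0.3 and 10,000 repetitions (none of which come from the text above). It builds each count by summing independent success/failure trials generated with `rand`, then compares the result with values drawn directly with `binornd`.

```matlab
rng default;                      % for reproducibility
n = 20;                           % number of trials (illustrative value)
p = 0.3;                          % probability of success (illustrative value)
reps = 10000;                     % number of simulated experiments (illustrative value)

% Each column holds one experiment of n independent success/failure trials.
trials = rand(n,reps) < p;
successes = sum(trials,1);        % total number of successes per experiment

% Draw the same number of values directly from the binomial distribution.
direct = binornd(n,p,1,reps);

% Both sample means should be close to n*p.
[mean(successes) mean(direct)]
```

If the two sample means differ noticeably from each other or from n*p, one of the three conditions is probably violated in the way the trials were generated.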

Parameters

The binomial distribution uses the following parameters.

| Parameter | Description | Support |
|---|---|---|
| `n` | Number of trials | positive integer |
| `p` | Probability of success | $0\le p\le 1$ |

Probability Density Function

The probability density function (pdf) is

$$f\left(x \mid n,p\right)=\binom{n}{x}{p}^{x}{\left(1-p\right)}^{n-x},\quad x=0,1,2,\ldots,n,$$

where x is the number of successes in n trials of a Bernoulli process with probability of success p.
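As a quick check of the formula, the sketch below evaluates the binomial coefficient and the two power terms directly and compares the result with `binopdf`. The values n = 10 and p = 0.5 are illustrative assumptions.

```matlab
n = 10;                           % number of trials (illustrative value)
p = 0.5;                          % probability of success (illustrative value)
x = 0:n;                          % every possible number of successes

% nchoosek expects a scalar k, so evaluate the coefficient one x at a time.
coeff = arrayfun(@(k) nchoosek(n,k), x);

% Evaluate the pdf formula term by term.
manual = coeff .* p.^x .* (1-p).^(n-x);

% Evaluate the same probabilities with the toolbox function.
builtin = binopdf(x,n,p);

maxdiff = max(abs(manual - builtin))   % should be zero up to rounding error
total = sum(manual)                    % the probabilities should sum to 1
```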

Mean and Variance

The mean is

$$\text{mean}=np.$$

The variance is

$$\mathrm{var}=np\left(1-p\right).$$
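The two formulas can be compared with the values returned by `binostat`. This is a small sketch that assumes the illustrative values n = 100 and p = 0.9.

```matlab
n = 100;                          % number of trials (illustrative value)
p = 0.9;                          % probability of success (illustrative value)

% Mean and variance as reported by the toolbox.
[m,v] = binostat(n,p)

% Mean and variance from the closed-form expressions.
mFormula = n*p
vFormula = n*p*(1-p)
```

Both pairs should agree up to floating-point rounding, with a mean of about 90 and a variance of about 9.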

Relationship to Other Distributions

The binomial distribution is a generalization of the Bernoulli distribution, allowing for a number of trials n greater than 1. The binomial distribution generalizes to the multinomial distribution when there are more than two possible outcomes for each trial.
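Both relationships can be illustrated numerically. The sketch below assumes the illustrative values p = 0.3, n = 10, and x = 4. It first evaluates the binomial pdf with a single trial, which is the Bernoulli case, and then compares a binomial probability with the corresponding two-category multinomial probability from `mnpdf`.

```matlab
p = 0.3;                          % probability of success (illustrative value)

% With a single trial the binomial distribution is the Bernoulli distribution:
% the probabilities of 0 and 1 successes are 1-p and p.
bern = binopdf([0 1],1,p)

% With two categories (success, failure) the multinomial distribution
% gives the same probability as the binomial distribution.
n = 10;                           % number of trials (illustrative value)
x = 4;                            % number of successes (illustrative value)
b = binopdf(x,n,p);
m = mnpdf([x n-x],[p 1-p]);
difference = abs(b - m)           % should be zero up to rounding error
```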

Example

Suppose you are collecting data from a widget manufacturing process, and you record the number of widgets within specification in each batch of 100. You might be interested in the probability that an individual widget is within specification. Parameter estimation is the process of determining the parameter, p, of the binomial distribution that fits this data best in some sense.

One popular criterion of goodness is to maximize the likelihood function. The likelihood has the same form as the binomial pdf above, but the roles of the variables are reversed. For the pdf, the parameters (n and p) are known constants and the variable is x. For the likelihood, the sample values (the x's) are already observed, so they are the fixed constants, and the variables are the unknown parameters. Maximum likelihood estimation (MLE) involves calculating the value of p that gives the highest likelihood given the particular set of data.
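For a single observed count x out of n trials, the maximization can be carried out in closed form. The following derivation is a sketch under that assumption, with n treated as known and $0<x<n$ so that the logarithms are finite. The symbols $L$ and $\hat{p}$ are introduced here for illustration.

$$
\begin{aligned}
L(p) &= \binom{n}{x}\,p^{x}\,(1-p)^{n-x},\\
\log L(p) &= \log\binom{n}{x} + x\log p + (n-x)\log(1-p),\\
\frac{d}{dp}\log L(p) &= \frac{x}{p} - \frac{n-x}{1-p} = 0
\quad\Longrightarrow\quad \hat{p} = \frac{x}{n}.
\end{aligned}
$$

Under these assumptions, the MLE of p is the observed proportion of successes.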

The function `binofit` returns the MLEs and confidence intervals for the parameters of the binomial distribution. Here is an example using random numbers from the binomial distribution with n = 100 and p = 0.9.

```
rng default; % for reproducibility
r = binornd(100,0.9)
```
```
r = 85
```
```
[phat, pci] = binofit(r,100)
```
```
phat = 0.8500
```
```
pci = 1×2

    0.7647    0.9135
```

The MLE for parameter p is 0.8500, compared to the true value of 0.9. The 95% confidence interval for p goes from 0.7647 to 0.9135, which includes the true value. In this made-up example you know the “true value” of p. In experimentation you do not.
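The confidence level can be changed through the optional third argument of `binofit`, which sets the significance level alpha. The sketch below reuses the observed count of 85 successes in 100 trials and requests a 99% interval in addition to the default 95% interval. The variable names are illustrative.

```matlab
x = 85;                           % observed number of successes
n = 100;                          % number of trials

% Default call: alpha = 0.05, that is, a 95% confidence interval.
[phat95,pci95] = binofit(x,n);

% alpha = 0.01 requests a 99% confidence interval.
[phat99,pci99] = binofit(x,n,0.01);

% The point estimate is the observed proportion in both cases.
[phat95 phat99 x/n]

% The 99% interval should contain the 95% interval.
[pci95; pci99]
```

A higher confidence level widens the interval, while the point estimate stays the same.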

The following commands generate a plot of the binomial pdf for n = 10 and p = 1/2.

```
x = 0:10;
y = binopdf(x,10,0.5);
plot(x,y,'+')
```