The geometric distribution models the number of failures before the first success in a series of independent trials, where each trial results in either success or failure, and the probability of success in any individual trial is constant. For example, if you toss a coin, the geometric distribution models the number of tails observed before the first heads. The geometric distribution is discrete, defined only on the nonnegative integers.

The geometric distribution uses the following parameter.

Parameter | Description |
---|---|
$$0\le p\le 1$$ | Probability of success |

The probability density function (pdf) of the geometric distribution is

$$y=f(x|p)=p{(1-p)}^{x}\quad;\quad x=0,1,2,\dots,$$

where *p* is the probability of success, and *x* is the number of failures before the first success. The result *y* is the probability of observing exactly *x* failures before the first success, when the probability of success in any given trial is *p*. For discrete distributions, the pdf is also known as the probability mass function (pmf).
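The formula can be checked numerically. The following is a minimal sketch, assuming `geopdf` from Statistics and Machine Learning Toolbox is available; the value of `p` and the names `yFormula`, `yToolbox`, and `maxAbsDiff` are illustrative choices, not part of the original text.

```matlab
% Evaluate the geometric pmf directly from its formula and compare with geopdf.
p = 0.25;                      % probability of success (illustrative value)
x = 0:10;                      % number of failures before the first success
yFormula = p .* (1 - p).^x;    % f(x|p) = p(1-p)^x
yToolbox = geopdf(x, p);       % toolbox evaluation of the same pmf
maxAbsDiff = max(abs(yFormula - yToolbox))   % expected to be at or near zero
```

At `x = 0` the pmf equals `p` itself, because no failures before the first success means that the first trial succeeds.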

This plot shows how changing the value of the probability parameter *p* alters the shape of the pdf. Use `geopdf` to compute the pdf values at *x* equal to 1 through 10, for three different values of *p*. Then plot all three pdfs on the same figure for a visual comparison.

    x = 1:10;
    y1 = geopdf(x,0.1);   % For p = 0.1
    y2 = geopdf(x,0.25);  % For p = 0.25
    y3 = geopdf(x,0.75);  % For p = 0.75
    figure;
    plot(x,y1,'kd')
    hold on
    plot(x,y2,'ro')
    plot(x,y3,'b+')
    legend({'p = 0.1','p = 0.25','p = 0.75'})
    hold off

In this plot, the value of *y* is the probability of observing exactly *x* failures before the first success. When the probability of success *p* is large, *y* decreases rapidly as *x* increases, and the probability of observing a large number of failures before a success quickly becomes small. But when the probability of success *p* is small, *y* decreases slowly as *x* increases. The probability of observing a large number of failures before a success still decreases as *x* increases, but at a much slower rate.
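The rate of decrease can be made concrete: from the pmf formula, each additional failure multiplies the probability by the constant factor 1 - *p*. The following sketch, assuming `geopdf` is available, prints the observed ratios of successive pmf values for the same three values of *p*; the variable name `ratio` is illustrative.

```matlab
% Successive pmf values shrink by the constant factor (1 - p).
x = 0:10;
for p = [0.1 0.25 0.75]
    y = geopdf(x, p);
    ratio = y(2:end) ./ y(1:end-1);   % each entry should be close to 1 - p
    fprintf('p = %.2f: ratios in [%.4f, %.4f], 1 - p = %.4f\n', ...
        p, min(ratio), max(ratio), 1 - p);
end
```

A large *p* gives a small factor and a fast decay, while a small *p* gives a factor close to 1 and a slow decay.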

A random number generated from a geometric distribution represents the number of failures observed before a success in a single experiment, given the probability of success *p* for each independent trial. Use `geornd` to generate random numbers from the geometric distribution. For example, the following generates a random number from a geometric distribution with probability of success *p* equal to 0.1.

    p = 0.1;
    r = geornd(p)

    r = 1

The returned random number represents the number of failures observed before a success in a series of independent trials.
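To illustrate what a single draw means, the following sketch simulates one experiment by hand with base MATLAB `rand`, then compares the average of many `geornd` draws with the theoretical mean (1 - *p*)/*p*. The seed, the sample size, and the names `failures` and `sampleMean` are arbitrary assumptions made for illustration, and the printed values depend on the random number stream.

```matlab
rng(0);                    % fix the seed so the sketch is repeatable (arbitrary choice)
p = 0.1;                   % probability of success in each trial

% One experiment: count failures until the first success.
failures = 0;
while rand >= p            % rand < p is treated as a success
    failures = failures + 1;
end
failures

% Many experiments: the sample mean should be near (1 - p)/p.
r = geornd(p, 1e5, 1);     % 100000-by-1 vector of geometric random numbers
sampleMean = mean(r)
theoreticalMean = (1 - p) / p
```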

The geometric distribution is a special case of the negative binomial
distribution, with the specified number of successes parameter
*r* equal to 1.
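This relationship can be checked numerically. The following is a minimal sketch, assuming the toolbox functions `nbinpdf` and `geopdf` are available; the value of `p` and the variable names are illustrative.

```matlab
% Negative binomial pmf with one required success versus the geometric pmf.
p = 0.25;                    % probability of success (illustrative value)
x = 0:10;                    % number of failures
yGeo  = geopdf(x, p);        % geometric pmf
yNbin = nbinpdf(x, 1, p);    % negative binomial pmf with r = 1
maxAbsDiff = max(abs(yGeo - yNbin))   % expected to be at or near zero
```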

The cumulative distribution function (cdf) of the geometric distribution is

$$y=F(x|p)=1-{\left(1-p\right)}^{x+1}\;;\;x=0,1,2,\dots,$$

where *p* is the probability of success, and *x* is the number of failures before the first success. The result *y* is the probability of observing up to *x* failures before the first success, when the probability of success in any given trial is *p*.
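The closed form follows from summing the pmf as a finite geometric series. The following sketch checks this in base MATLAB by accumulating the pmf values and comparing them with the closed-form expression; the value of `p` and the variable names are illustrative.

```matlab
% The cdf is the running sum of the pmf over 0, 1, ..., x.
p = 0.25;                                % probability of success (illustrative value)
x = 0:10;                                % number of failures before the first success
cdfFromPmf = cumsum(p .* (1 - p).^x);    % sum of p(1-p)^k for k = 0..x
cdfClosed  = 1 - (1 - p).^(x + 1);       % closed-form cdf
maxAbsDiff = max(abs(cdfFromPmf - cdfClosed))   % expected to be at or near zero
```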

This plot shows how changing the value of the parameter *p* alters the shape of the cdf. Use `geocdf` to compute the cdf values at *x* equal to 1 through 10, for three different values of *p*. Then plot all three cdfs on the same figure for a visual comparison.

    x = 1:10;
    y1 = geocdf(x,0.1);   % For p = 0.1
    y2 = geocdf(x,0.25);  % For p = 0.25
    y3 = geocdf(x,0.75);  % For p = 0.75
    figure;
    plot(x,y1,'kd')
    hold on
    plot(x,y2,'ro')
    plot(x,y3,'b+')
    legend({'p = 0.1','p = 0.25','p = 0.75'})
    hold off

In this plot, the value of *y* is the probability of observing up to *x* failures before the first success. When the probability of success *p* is large, *y* increases rapidly as *x* increases. The probability of observing a success quickly becomes very high, even for a small number of trials. But when the probability of success *p* is small, *y* increases slowly as *x* increases. The probability of observing a success still increases as the number of trials increases, but at a much slower rate.

The inverse cdf of a geometric distribution determines the smallest value of *x* for which the probability of observing up to *x* failures before the first success is at least a given probability *y*. Use `geoinv` to compute the inverse cdf of the geometric distribution. For example, the following returns the smallest integer *x* such that the geometric cdf evaluated at *x* is greater than or equal to 0.1, when the probability of success for each independent trial *p* is 0.03.

    y = 0.1;
    p = 0.03;
    x = geoinv(y,p)

    x = 3
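The same value can be obtained by solving the cdf formula for *x*. The following sketch assumes 0 < *y* < 1 and 0 < *p* < 1. The closed-form expression is derived here from the cdf and is not necessarily how `geoinv` is implemented, and the variable names are illustrative.

```matlab
% Solve 1 - (1-p)^(x+1) >= y for the smallest nonnegative integer x.
y = 0.1;
p = 0.03;
xClosed  = max(ceil(log(1 - y) / log(1 - p)) - 1, 0);   % closed-form inverse
xToolbox = geoinv(y, p);                                 % toolbox inverse cdf

% Bracket check: the cdf reaches y at x, but not yet at x - 1.
cdfAtX   = geocdf(xToolbox, p);
cdfBelow = geocdf(xToolbox - 1, p);
fprintf('xClosed = %d, xToolbox = %d, cdf(x-1) = %.4f, cdf(x) = %.4f\n', ...
    xClosed, xToolbox, cdfBelow, cdfAtX);
```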

The mean of the geometric distribution is

$$\text{mean}=\frac{1-p}{p},$$

and the variance of the geometric distribution is

$$\text{var}=\frac{1-p}{{p}^{2}},$$

where *p* is the probability of success.

Use `geostat` to compute the mean and variance of a geometric distribution. For example, the following computes the mean *m* and variance *v* of a geometric distribution with probability parameter *p* equal to 0.25.

    p = 0.25;
    [m,v] = geostat(p)

    m = 3

    v = 12
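These values can be approximated directly from the pmf by summing over a truncated support. The following base MATLAB sketch assumes that truncating at 1000 is sufficient because the neglected tail is negligible for *p* = 0.25; the names `mSum` and `vSum` are illustrative.

```matlab
% Approximate the mean and variance from the pmf over a truncated support.
p = 0.25;
x = 0:1000;                          % truncation point (assumption; tail is negligible here)
f = p .* (1 - p).^x;                 % pmf values
mSum = sum(x .* f);                  % approximates (1 - p)/p
vSum = sum((x - mSum).^2 .* f);      % approximates (1 - p)/p^2
fprintf('mean ~ %.6f, variance ~ %.6f\n', mSum, vSum);
```

Both sums should agree with the closed-form mean and variance to within rounding error.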

Suppose the probability that a car with a five-year-old battery does not start in cold weather is 0.03. What is the probability of the car starting for 25 consecutive days during a long cold snap?

Model the scenario using a geometric distribution, where "failure" means the car starts, and "success" means the car does not start. Determine the probability of observing 25 failures (the car starts) without observing a single success (the car does not start). The probability of success for each trial (the car not starting on any single attempt) is *p* equal to 0.03.

Compute the cumulative distribution function (cdf) for *x* equal to 24. Because *x* counts the failures before the first success, the cdf at *x* equal to 24 returns the probability of observing a success (the car not starting) at some point in the first 25 trials.

    x = 24;
    p = 0.03;
    psuccess = geocdf(x,p);

To determine the probability of not observing a success in the first 25 trials - in other words, the probability that the car starts on every one of the 25 attempts - subtract this result from 1.

    pfail = 1 - psuccess

    pfail = 0.4670

The returned result `pfail = 0.4670` is the probability that the car will start every day for 25 days in a row during a cold snap.
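As a cross-check on the complement rule, note that `1 - geocdf(x,p)` is the probability that the first *x* + 1 trials are all failures, which equals (1 - *p*) raised to the power *x* + 1. The following sketch, assuming independent attempts with the same *p*, compares the two computations over a range of *x*; the variable names are illustrative.

```matlab
% Complement of the cdf versus a direct power of the per-trial failure probability.
p = 0.03;                              % probability that the car does not start
x = 0:30;                              % failures before the first success
pNoSuccessCdf    = 1 - geocdf(x, p);   % no success in the first x + 1 trials
pNoSuccessDirect = (1 - p).^(x + 1);   % the car starts on each of x + 1 attempts
maxAbsDiff = max(abs(pNoSuccessCdf - pNoSuccessDirect))   % expected to be at or near zero
```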

The cdf plot shows that, as `x` increases, the cumulative probability `y` of having observed a success also increases. In this example, it means that the more times you attempt to start the car, the greater the probability that it does not start on at least one of those occasions.

    figure;
    x = 0:25;
    y = geocdf(x,0.03);
    stairs(x,y)

`geocdf` | `geoinv` | `geopdf` | `geornd` | `geostat` | `random`