Generate random values from a pdf within a given interval

Question

0 voti

Hi,

I wish to generate random values from a given probability density distribution that's only within a certain interval. Is there a smart way to do this?

Chris

0 Commenti
Mostra -2 commenti meno recenti Nascondi -2 commenti meno recenti

Accedi per commentare.

Accedi per rispondere a questa domanda.

Follow Question

Answer 1

Walter Roberson il 5 Set 2016

1 voto

Whether there is a smart way or not depends a lot on the pdf. If you are using uniform random then, Yes, there is. If you are using beta distribution, adjust the parameters to only cover that range. For anything else, typically there is no "smart" way of doing it, partly because any other kind of distribution has at least one infinite tail and truncating an infinite distribution gives you something that does not belong to any probability distribution.

There is a convenience function if you are using Statistics Toolbox: http://www.mathworks.com/help/stats/prob.truncatabledistribution.truncate.html

2 Commenti
Mostra Nessuno Nascondi Nessuno

Chris Egeland Busuttil il 5 Set 2016

Thats what I thought.. I ended up using a "while" loop. Not elegant, but it does the trick. Thanks for the help.

Walter Roberson il 5 Set 2016

The rejection method is very powerful.

If you have an estimate of the fraction you are going to accept then it can sometimes be more efficient to generate 1/fraction times as many values as you need, then do the rejection, then test to see if you are still short of values, and if so then generate a small batch... and keep doing this until you have enough values. If you end up with more than you need, either keep them for future or throw them away. This approach can be much more efficient than generating one by one. You can fine tune the generation ratio to reduce the number of generation calls you make.

Generating a big batch and rejecting then, and storing what is left and giving those out at need, can be a pretty useful approach.

Accedi per commentare.

Answer 2

John D'Errico il 5 Set 2016

Modificato: John D'Errico il 5 Set 2016

Apri in MATLAB Online

2 voti

It does very much depend on the distribution. And, yes, one could write a book on the subject. Luckily, people already have done so. It very much depends on the existence of an inverse CDF function for your distribution, or an approximation for one. It also helps if a transformation exists between distributions.

There are several valid approaches, and it very much depends on the distribution.

1. acceptance-rejection. Here you generate samples, rejecting those that fall outside the limits. Keep those that are inside. This is great if the probability is high that one will lie inside the acceptable limits. But if the rejection fraction is too large, then you may never see an acceptable result.

2. Transformations. Some distributions are transformable to something simple to work with. Lots of the classic distributions can be transformed exactly into either a beta distribution, a gamma distribution, or a normal distribution. Each of these have direct methods to work with the inverse of the CDF. Sometimes the transformation is only approximate, like that between the Poisson and a normal. It is fine for some Poisson parameters, but not most of them, and not the ones that are interesting.

3. If the inverse CDF is available, or a transformation thereto, then use it!

Again, you can find entire books written on these subjects.

Some of the classic texts (said early in the am, without looking at my bookshelf because it is too dark and I am sleepy) are Johnson & Kotz (Balakrishnan was a co-author on later editions of this multi-volume tome), and Abramowitz & Stegun.

Suppose your goal was to work with a Normal distribution. Even that can be done using either of the main approaches above.

Suppose we wish to generate normal deviates from a standard unit normal, so mean 0, variance 1, but those that lie within the interval [a,b]. As an example, I'll use the interval [1,2]. Acceptance-rejection would work fine here, but lets not.

ab = [1 2];
ab_cdf = normcdf(ab,0,1)
ab_cdf =
      0.84134      0.97725

Here, I have the stats toolbox, and it has normcdf. Yes, were I without that useful tool, I could have done the same with a transformation of erf or erfc. So knowing ones special functions is crucial. That is why you need a copy of Abramowitz and Stegun on your shelf.

The above numbers can be viewed as probabilities.

diff(ab_cdf)
ans =
      0.13591

By the way, there is a 13.6% rate that rejection will succeed here, so you will need to oversample by about 7-1, rejecting roughly 7 samples for every one that lies inside the interval of interest.

So instead, we now generate UNIFORM deviates, that live within the interval bounded by ab_cdf.

N = 1000;
u = rand(1,N)*diff(ab_cdf) + ab_cdf(1);
x = norminv(u,0,1);
[min(x),max(x)]
ans =
       1.0001       1.9961

So it appears as if we have succeeded. In fact, we have. No rejection was needed at all. You could use hist to convince yourself they are Normally distributed, in the interval of interest. Easy to do using normcdf and norminv, but almost as easy using erfc and erfcinv, were I lacking the stats TB. All that is needed is to know the transformation from the special function erfc into the normal CDF.

So understanding how to work with the special functions erf/erfc, and the incomplete beta and gamma functions are all crucial.

Suppose you gave me a distribution where ONLY the PDF was available? Personally, I might decide to compute a very accurate approximation to the CDF using numerical integration, so the CDF, at a discrete set of points. Then I might approximate the CDF using a spline in the form of pchip. (Pchip is the choice because it is guaranteed to be monotonic.) But it is rather easy to form the inverse of a pchip interpolant, so I would now have an effective inverse CDF too.

Any of the above methods will fail SOMETIMES. For example, compute samples from a unit Normal that lie in the interval [1000 1010].

Acceptance-rejection will take somewhat longer than the lifetime of the universe to generate one sample that lies in that interval. And the inverse CDF method will fail due to floating point problems. My solution here would be to recognize there are simple schemes for approximating the tails of the normal CDF way out there. These approximations are even pretty accurate. (Abramowitz & Stegun will be your go-to text here.)

2 Commenti
Mostra Nessuno Nascondi Nessuno

Chris Egeland Busuttil il 5 Set 2016

This is quite clever! I have the CDF for each density function, so that would not be a problem. At the moment I solved it using a while loop. Not the most efficient way, but it does the trick. I definitely will look into your second method in the future.

Acceptance-rejection will not be adequate as the rejection probability is to high.

Thanks for the help!

John D'Errico il 5 Set 2016

There are lots of tricks one can learn here. A quick scan through the appropriate chapter of Johnson & Kotz (& Balakrishnan) is well worth your time. An expensive book as I recall, but I used it a fair amount when I was writing my stats toolbox, and you can surely find it in a university library. My copy of Abramowitz & Stegun was so heavily used that I put tabs on the pages for the probability distributions, beta & gamma functions, and orthogonal polynomials.

Accedi per commentare.

Answer 3

KSSV il 5 Set 2016

Apri in MATLAB Online

0 voti

Random numbers within a specific range can be generated using:

N = 1000 ;
a = 50;
b = 100;
r = (b-a).*rand(N,1) + a;

1 Commento
Mostra -1 commenti meno recenti Nascondi -1 commenti meno recenti

Walter Roberson il 5 Set 2016

That only works for uniform random numbers.

Accedi per commentare.

Generate random values from a pdf within a given interval

0 Commenti
Mostra -2 commenti meno recenti Nascondi -2 commenti meno recenti

Risposta accettata

2 Commenti
Mostra Nessuno Nascondi Nessuno

Più risposte (2)

2 Commenti
Mostra Nessuno Nascondi Nessuno

1 Commento
Mostra -1 commenti meno recenti Nascondi -1 commenti meno recenti

Categorie

Tag

Community Treasure Hunt

Generate random values from a pdf within a given interval

0 Commenti Mostra -2 commenti meno recenti Nascondi -2 commenti meno recenti

Risposta accettata

2 Commenti Mostra Nessuno Nascondi Nessuno

Più risposte (2)

2 Commenti Mostra Nessuno Nascondi Nessuno

1 Commento Mostra -1 commenti meno recenti Nascondi -1 commenti meno recenti

Categorie

Tag

Vedere anche

Community Treasure Hunt

0 Commenti
Mostra -2 commenti meno recenti Nascondi -2 commenti meno recenti

2 Commenti
Mostra Nessuno Nascondi Nessuno

2 Commenti
Mostra Nessuno Nascondi Nessuno

1 Commento
Mostra -1 commenti meno recenti Nascondi -1 commenti meno recenti