Generating dispersed (non-integer) random matrix/array that sums to a particular value

Question

J AI on 28 Jun 2020

Direct link to this question

https://it.mathworks.com/matlabcentral/answers/555907-generating-dispersed-non-integer-random-matrix-array-that-sums-to-a-particular-value

Edited: J AI on 28 Jun 2020

The most commonly suggested tool (in fact the only one I could find) for generating random numbers (<1) that sum to 1 is Random Vectors with Fixed Sum by Roger Stafford. However, I noticed that the data it generates is not well dispersed, e.g.,

P = randfixedsum(10,10000,1,0.05,0.9); % a 10-by-10000 matrix where each column of P sums to 1 and each element is between 0.05 and 0.9
find(any(P>0.5))
ans =
  1×0 empty double row vector

So far, every time I have tried this, the result is an empty vector: no element ever exceeds 0.5. Is there a way to generate more dispersed data that covers the whole range between 0.05 and 0.9 (for the above example)?

Thanks in advance for your kind help.

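FYI: I have tried this (adapted from one of the MATLAB Answers posts)

function P = rand_fixed_sum_2(p,n) % p = number of columns, n = number of rows; each column sums to 1
    P = zeros(n,p);                    % preallocate the output
    n1 = 10^(n-1);                     % size of the integer grid to draw from
    m = 1:n1;
    for j = 1:p
        a = m(sort(randperm(n1,n)));   % n distinct grid points, sorted
        b = diff(a);                   % the n-1 gaps between consecutive points
        b(end+1) = n1-sum(b);          % last entry takes the remainder, so sum(b) = n1
        P(:,j) = (b/sum(b))';          % normalize so the column sums to 1
    end
end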

But obviously the value of n1 is not feasible for higher dimensions (n>5). However, for lower dimensions, by tweaking n1, I could get much more dispersed data.

Answer 1

John D'Errico on 28 Jun 2020

Direct link to this answer

https://it.mathworks.com/matlabcentral/answers/555907-generating-dispersed-non-integer-random-matrix-array-that-sums-to-a-particular-value#answer_458233

Edited: John D'Errico on 28 Jun 2020

I think you do not understand what you are asking.

randfixedsum indeed produces results that are uniformly distributed within the subset in question. That is, any point in the 10-dimensional space that satisfies the fixed-sum requirement and the bounds is equally likely to arise.

However, that does not mean that it is at all probable you would find something that satisfies your goal, of "dispersion".

For example, suppose one element is greater than 0.5. Then the other 9 elements must ALL be small enough that the sum is still 1, and the probability of that is pretty low. In the 9-dimensional space that remains, that event is very uncommon.

Thus, you want to generate 10 numbers, all of which lie between 0.05 and 0.9, such that the sum is 1.

Suppose, just suppose that one of the numbers was say, 0.6? Now what are the odds that you can find 9 other numbers that make the total sum exactly 1, but none of them are less than 0.05? SURPRISE! It can never be done.

In fact, if any single element were larger than 0.55 in this example, your goal would never be achievable. If one element is even 0.55+eps, the remaining 9 numbers must sum to 0.45-eps, but 9 numbers that are each at least 0.05 sum to at least 0.45, so that is mathematically impossible.
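The 0.55 limit is plain arithmetic on the bounds. A minimal sketch, using variable names of my own choosing (n, s, a, b are not randfixedsum internals, just labels for the element count, the target sum, and the two bounds):

```matlab
% Feasible range of a single element when n elements in [a, b] must sum to s.
% The other n-1 elements contribute at least (n-1)*a and at most (n-1)*b.
n = 10; s = 1; a = 0.05; b = 0.9;
maxFeasible = min(b, s - (n-1)*a);   % 1 - 9*0.05 = 0.55 in this example
minFeasible = max(a, s - (n-1)*b);   % the lower bound a = 0.05 stays binding
fprintf('Each element must lie in [%.2f, %.2f]\n', minFeasible, maxFeasible);
```

So the upper bound of 0.9 passed to randfixedsum never comes into play here; the lower bound of 0.05 on the other nine elements is what caps each value at 0.55.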

Next, suppose one element was even as large as 0.5? Just one element that large?

Now the other 9 elements must all be very close to 0.05. What is the probability of that event? Not surprisingly, it is pretty darn small. I can compute the actual probability of such an event if you need. Being too lazy to think at this time of day...

X = randfixedsum(10,10000000,1,0,0.9);
sum(max(X) >= 0.5)
ans =
      195844

So 1.96e5 such events in 1e7, a little under 2% of the time, and that is with the lower bound relaxed to 0 in the call above. As expected, a rare event, and that is EXACTLY as it should be.
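For the lower bound of 0, the simulated count can be checked against a closed form. This is a sketch under two assumptions of mine: the sample is uniform on the simplex, and the 0.9 upper bound is ignored (it removes only about 1e-8 of the probability mass). For a threshold t >= 0.5, at most one element can exceed t, so the probability is n*(1-t)^(n-1). The Monte Carlo part does not call randfixedsum; it uses normalized exponential draws, which are uniform on the simplex, and relies on implicit expansion (R2016b or later).

```matlab
% Probability that the largest of n uniform-simplex coordinates is >= t.
% Valid for t >= 0.5, where the n single-coordinate events cannot overlap.
n = 10; t = 0.5;
pExact = n * (1 - t)^(n - 1);        % 10 * 0.5^9 = 0.01953125

% Monte Carlo check: normalized exponentials are uniform on the simplex.
rng(0);                              % fixed seed, only for repeatability
N = 1e6;
E = -log(rand(n, N));                % exponential(1) draws
X = E ./ sum(E, 1);                  % each column now sums to 1
pMC = mean(max(X, [], 1) >= t);      % fraction of columns with max >= t
fprintf('exact %.5f, simulated %.5f\n', pExact, pMC);
```

The exact value of about 1.95% agrees with the 195844 hits in 1e7 reported above.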

You ask for dispersion. But you don't seem to understand what dispersion means or what it implies in this context.

If I look at the distribution of the maximum of all 10 elements, I get something that is actually pretty reasonable.

X = randfixedsum(10,10000,1,0.05,0.9);
   Min     0.1207
    1%     0.1342
    5%     0.1445
   10%     0.1524
   25%     0.1674
   50%     0.1884
   75%     0.2167
   90%     0.2503
   95%     0.2738
   99%     0.3143
   Max     0.4039

Most of the time, we get a maximum value that is pretty small in context. And that is because the sample truly is uniformly distributed over the constraint space. One point in that space is as likely to arise as any other point. But that does NOT mean that the maximum is ever likely to come close to 0.55, and anything larger than 0.55 would be an impossible event.
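To put a number on how rare a value above 0.5 is with the 0.05 lower bound, here is a rough sketch. It assumes the sample is uniform on the constraint set, as stated above, and uses the fact that subtracting 0.05 from every element and dividing by 0.5 maps that set onto the standard simplex (the 0.9 upper bound is not binding, since no element can exceed 0.55). The variable names are mine.

```matlab
% Chance of seeing an element >= 0.5 when all 10 elements are >= 0.05.
n = 10; a = 0.05; t = 0.5; N = 10000;
scale = 1 - n*a;                     % 0.5 is left to share out once each element has a
tUnit = (t - a) / scale;             % 0.9: the same threshold on the unit simplex
pOne  = n * (1 - tUnit)^(n - 1);     % per column, valid because tUnit >= 0.5
pAny  = 1 - (1 - pOne)^N;            % at least one hit among N columns
fprintf('per column: %.3g, any of %d columns: %.3g\n', pOne, N, pAny);
```

That works out to roughly 1e-8 per column and about 1e-4 for a 10000-column sample, which is why find(any(P>0.5)) comes back empty essentially every time.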

Suppose instead that we change the way things are generated. Now, instead of requiring that the minimum be 0.05, just make it 0. How do the statistics change?

X = randfixedsum(10,10000,1,0,0.9);
   Min     0.1395
    1%     0.1681
    5%     0.1902
   10%     0.2050
   25%     0.2353
   50%     0.2784
   75%     0.3359
   90%     0.4010
   95%     0.4492
   99%     0.5479
   Max     0.8123

As you now see, the maximum element is now considerably larger. In the same size sample, I once got something as large as 0.8123. There is now much more room for those "dispersed" events to arise.

1 Comment

J AI on 28 Jun 2020

Edited: J AI on 28 Jun 2020

Oh wow, I really appreciate your detailed, painstaking explanation. I can see how I got the whole thing messed up with my requirements. Thank you so much for clearing it up.
