How can I obtain all possible combinations of 3 decimals whose sum equals 1, without running into memory and space issues?

5 visualizzazioni (ultimi 30 giorni)
Hi all,
I am trying to obtain all possible combinations for 4 identical vectors (v1, v2, v3, and v4) whose sum equals 1.
v1 = 0:0.001:1;
v2 = 0:0.001:1;
v3 = 0:0.001:1;
v4 = 0:0.001:1;
I have tried several options, but I run into memory issues every time (not surprisingly as all the possible combinations are 1001^4), i.e.:
vF = combvec(v1,v2,v3,v4);
j=find(sum(vF)==1);
vF=vF(:,j);
% >> Error using zeros
% >> Requested 2x1003003001 (14.9GB) array exceeds maximum array size preference (13.9GB). This might cause MATLAB to become unresponsive.
% or
vF = combinations(v1,v2,v3,v4);
j=find(sum(vF)==1);
vF=vF(:,j);
% >> Error using repelem
% >> Requested 1004006004001x1 (7480.4GB) array exceeds maximum array size preference (13.9GB). This might cause MATLAB to become unresponsive.
However, as I only need those combinations whose sum =1, I thought there might be an alternative approach.
Can you suggest a way to achieve the above, if possible (with 'normal computers')?
Thanks a lot in advance!

Risposte (4)

Bruno Luong
Bruno Luong il 15 Lug 2023
Modificato: Bruno Luong il 15 Lug 2023
The array is about 5.4Gb quite manageble on modern computer
>> vF=allVL1(4,1000)/1000;
>> whos
Name Size Bytes Class Attributes
vF 167668501x4 5365392032 double
>> vF([1 1000 end],:) % check few outputs
ans =
0 0 0 1.0000
0 0 0.9990 0.0010
1.0000 0 0 0
Generate in single to save 50% memory , no problem
>> vF=allVL1(4,single(1000))/single(1000);
>> whos
Name Size Bytes Class Attributes
vF 167668501x4 2682696016 single
>> vF([1 2023 end],:)
ans =
3×4 single matrix
0 0 0 1.0000
0 0.0020 0.0210 0.9770
1.0000 0 0 0

Torsten
Torsten il 15 Lug 2023
Spostato: Torsten il 15 Lug 2023
If you have some time, you can try I = 0:0.001:1. Why don't you use the code from the File Exchange ?
I = 0:0.01:1;
n = numel(I);
vF = [];
for i = 1:n
vi = I(i);
for j = 1:n
vj = I(j);
for k = 1:n
vk = I(k);
s = vi+vj+vk;
if s<=1
vl = 1-s;
vF = [vF,[vi;vj;vk;vl]];
end
end
end
end
size(vF)
ans = 1×2
4 176817
  3 Commenti
Torsten
Torsten il 15 Lug 2023
Modificato: Torsten il 15 Lug 2023
"randfixedsum" does what it claims to do: supply quadruples (x1,x2,x3,x4) that are uniformly distributed on x1+x2+x3+x4 = 1, xi>=0.
If you claim you need more extreme values, you don't need the quadruples to be uniformly distributed, but with a bias to the edges.
If you generate uniformly distributed numbers on [0 1], you will also get quite a small number of values greater than 0.999 or smaller than 0.001, I guess.
Maybe after applying "randfixedsum", you can add these extreme cases manually to the matrix returned by the function "randfixedsum".
Simone A.
Simone A. il 15 Lug 2023
'Maybe after applying "randfixedsum", you can add these extreme cases manually to the matrix returned by the function "randfixedsum".'
That could be an option that I could take into consideration with some workaround, I just need to figure out how.
'If you generate uniformly distributed numbers on [0 1], you will also get quite a small number of values greater than 0.999 or smaller than 0.001, I guess.'
Not sure about it, I should check just out of curiosity!
Thanks once again

Accedi per commentare.


Mrutyunjaya Hiremath
Mrutyunjaya Hiremath il 15 Lug 2023
% Set the precision for the sum comparison
precision = 1e-3;
% Generate a range of values with three decimal points
values = 0:0.001:1;
combinations = {};
% Nested loop to iterate through all combinations of the vectors
for v1 = values
for v2 = values
for v3 = values
for v4 = values
% Check if the sum is approximately equal to 1
if abs(v1 + v2 + v3 + v4 - 1) < precision
combination = [v1, v2, v3, v4];
combinations{end + 1} = combination;
end
end
end
end
end
size(combinations)
combinations

DGM
DGM il 15 Lug 2023
Modificato: DGM il 15 Lug 2023
Are the particular vectors even important here? Or are you trying to find the combinations of four decimal numbers between 0.000 and 0.999 which have a sum of 1?
I'm going to choose to suspect that. If so, you can just calculate them. There's a lot of them, so either you're going to spend time or memory to do it. I'm going to lean on memory resources. Peak memory usage is probably over 12GB, but the individual arrays are around 2GB.
I'm going to do this scaled by 1000. That lets us work with integers, which means rounding isn't an issue, and it also lets us use a smaller numeric class to save memory. If you want decimals, cast the output as float and divide by 1000.
S = 1000; % scaled by 1000
[aa bb cc] = meshgrid(int16(0:S-1));
dd = S - (aa + bb + cc);
mask = dd >= 0;
out = [aa(mask) bb(mask) cc(mask) dd(mask)];
out = out(2:end,:); % I'm assuming that [0.000 0.000 0.000 1.000] is not a candidate
% check the row sums
all(sum(out,2)==S)
ans = logical
1
% check that all elements are <N
all(out(:)<S)
ans = logical
1
% how many results?
size(out)
ans = 1×2
167668497 4
There are probably smarter ways, but I'm not the clever one.
  2 Commenti
Simone A.
Simone A. il 15 Lug 2023
Hi @DGM, this is definetely the faster one I've seen so far (and it does the job!!). If I would like to increase the decimals, i.e.,
v1 = 0:0.0001:1; % or v1 = 0:0.00001:1;
v2 = v1;
v3 = v1;
v4 = v1;
Should I just change S = 10000; I might be wrong but I have tried that and it gave me:
% Error using repmat
% Requested 10000x10000x10000 (1862.6GB) array exceeds maximum array size preference (15.8GB). This might
% cause MATLAB to become unresponsive.
%
% Error in meshgrid (line 80)
% xx = repmat(xx, ny, 1, nz);
%
% Error in untitled8 (line 8)
% [aa bb cc] = meshgrid(int16(0:S-1));
%
% Related documentation
Thanks for your time!
DGM
DGM il 15 Lug 2023
Modificato: DGM il 15 Lug 2023
So long as there is the same number of terms in the sum, the set of solutions occupies the same fraction of the input volume (roughly 1/6). If you scale the input volume by three orders of magnitude, so scales the number of solutions.
The result will occupy around 1.3-1.4 terabytes in memory as uint16. If you wanted that as single-precision float, it would require twice that. Regardless of the method used to obtain it, you probably can't fit the result in memory.

Accedi per commentare.

Prodotti


Release

R2022b

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!

Translated by