Divide the matrix (Rx2) into submatrices based on the values of the second column

Hi! I tried to split the 'matrix_out' matrix into submatrices in steps of 0.1, and for the most part I succeeded.
load matrix_out
% =======
matrix_out_0 = matrix_out(matrix_out(:,2) < 0.1, :);
tot_percent_matrix_out_0 = sum(matrix_out_0(:,2));
matrix_separation_0 = [{matrix_out_0}, tot_percent_matrix_out_0];
% =======
matrix_separation = {};
j = 0.1:0.1:1.2;
for K = 1:width(j)
matrix_out_new = matrix_out((matrix_out(:,2) >= j(K) & matrix_out(:,2) < (0.1*K)+0.1), :);
tot_percent_matrix_out_new = sum(matrix_out_new(:,2));
matrix_separation = [matrix_separation; {matrix_out_new},tot_percent_matrix_out_new];
end
matrix_separation = [matrix_separation_0 ; matrix_separation]
matrix_separation = 13×2 cell array
    {31×2 double}    {[ 0.5500]}
    { 6×2 double}    {[ 0.9100]}
    {69×2 double}    {[17.9400]}
    {33×2 double}    {[11.3900]}
    {13×2 double}    {[ 5.7800]}
    {10×2 double}    {[ 5.5900]}
    {10×2 double}    {[ 6.4000]}
    { 8×2 double}    {[      6]}
    { 6×2 double}    {[ 5.1300]}
    {11×2 double}    {[10.4500]}
    {11×2 double}    {[11.4000]}
    {14×2 double}    {[15.9400]}
    { 3×2 double}    {[ 3.6900]}
However, I noticed that the row 423|1.2 appears in both the penultimate and the last cell of 'matrix_separation'.
The row 423|1.2 should only appear in the last cell, given the range >=1.2 & <1.3! Thanks to whoever can clear this up...

Accepted Answer

Dyuman Joshi on 20 Sep 2023
Edited: Dyuman Joshi on 20 Sep 2023
discretize and splitapply for the win!
load matrix_out
%Mention the bins to group data in
j = [0 0.1:0.1:1.2 Inf];
%Discretize the data
idx = discretize(matrix_out(:,2),j);
%Split the array according to the groups
out1 = splitapply(@(x) {x}, matrix_out, idx)
out1 = 13×1 cell array
    {31×2 double}
    { 6×2 double}
    {69×2 double}
    {33×2 double}
    {13×2 double}
    {10×2 double}
    {10×2 double}
    { 8×2 double}
    { 6×2 double}
    {11×2 double}
    {11×2 double}
    {13×2 double}
    { 3×2 double}
You can see above that the second-to-last group is 13×2 instead of 14×2. The sum obtained changes accordingly as well.
%Get the sum of the 2nd column according to the groups
out2 = splitapply(@(x) sum(x), matrix_out(:,2), idx)
out2 = 13×1
    0.5500
    0.9100
   17.9400
   11.3900
    5.7800
    5.5900
    6.4000
    6.0000
    5.1300
   10.4500
   11.4000
   14.7400
    3.6900
%Concatenate to get the final output
out = [out1 num2cell(out2)]
out = 13×2 cell array
    {31×2 double}    {[ 0.5500]}
    { 6×2 double}    {[ 0.9100]}
    {69×2 double}    {[17.9400]}
    {33×2 double}    {[11.3900]}
    {13×2 double}    {[ 5.7800]}
    {10×2 double}    {[ 5.5900]}
    {10×2 double}    {[ 6.4000]}
    { 8×2 double}    {[      6]}
    { 6×2 double}    {[ 5.1300]}
    {11×2 double}    {[10.4500]}
    {11×2 double}    {[11.4000]}
    {13×2 double}    {[14.7400]}
    { 3×2 double}    {[ 3.6900]}
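A possible explanation for the duplicate row in the original loop, sketched under the assumption that it comes from taking the lower bound from j while computing the upper bound as (0.1*K)+0.1: the two expressions need not round to the same double. The names upper_computed, lower_next, in_penultimate and in_last are illustrative and not part of the original code.

```matlab
% Edge vector as in the question
j = 0.1:0.1:1.2;
K = 11;                           % penultimate pass of the original loop
upper_computed = (0.1*K) + 0.1;   % upper bound as written in the question
lower_next = j(K+1);              % lower bound used on the next pass
% Print both stored values with 20 decimals
fprintf('%.20f\n', upper_computed, lower_next);
% A value typed as 1.2 passes both range tests, so it is selected twice
x = 1.2;
in_penultimate = x >= j(K)   && x < upper_computed;
in_last        = x >= j(K+1) && x < (0.1*(K+1)) + 0.1;
disp([in_penultimate in_last])
```

Taking both bounds from the same edge vector, for example j(K) and j(K+1), avoids this mismatch; discretize in the answer above works from a single edge vector in the same way.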
  3 Comments
Alberto Acri on 21 Sep 2023
Edited: Alberto Acri on 21 Sep 2023
I was checking your code.
I noticed that out{3,1} contains values (in the second column) between 0.2 and 0.3, with both 0.2 and 0.3 included.
P.S. It didn't happen with the other matrices because they didn't contain the values at the extremes of the interval.
I need to have intervals in the following way:
<0.10
>=0.10 & <0.20
>=0.20 & <0.30
>=0.30 & <0.40
...
Can you modify the code you provided me?
Dyuman Joshi on 22 Sep 2023
What you are seeing is the limitation of floating point numbers.
load matrix_out
%Mention the bins to group data in
j = [0 0.1:0.1:1.2 Inf];
%% Let's see what the data is stored as
%First the matrix values
%Displayed value
disp(matrix_out(10:15,2))
    0.0100
    0.0100
    0.2200
    0.3000
    0.2600
    0.2400
%Stored value
fprintf('%0.42f\n',matrix_out(10:15,2))
0.010000000000000000208166817117216851329431
0.010000000000000000208166817117216851329431
0.220000000000000001110223024625156540423632
0.299999999999999988897769753748434595763683
0.260000000000000008881784197001252323389053
0.239999999999999991118215802998747676610947
%Now the values of the groups
%Displayed value
disp(j')
         0
    0.1000
    0.2000
    0.3000
    0.4000
    0.5000
    0.6000
    0.7000
    0.8000
    0.9000
    1.0000
    1.1000
    1.2000
       Inf
%Stored values
fprintf('%0.42f\n',j)
0.000000000000000000000000000000000000000000
0.100000000000000005551115123125782702118158
0.200000000000000011102230246251565404236317
0.300000000000000044408920985006261616945267
0.400000000000000022204460492503130808472633
0.500000000000000000000000000000000000000000
0.599999999999999977795539507496869191527367
0.699999999999999955591079014993738383054733
0.799999999999999933386618522490607574582100
0.899999999999999911182158029987476766109467
1.000000000000000000000000000000000000000000
1.099999999999999866773237044981215149164200
1.199999999999999955591079014993738383054733
Inf
You can see that the values are not exactly 0.1, 0.2, 0.3 etc. A binary floating point number can only store a value exactly if it is a finite sum of powers of 2; among the edges above, that is the case only for 0, 0.5 (= 2^-1) and 1 (= 2^0).
This means there will be small errors when working with floating point numbers, and comparisons at the bin edges can go either way.
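To make the effect concrete, here is a minimal sketch that uses only the edge vector from the code above and a value typed as the literal 0.3. It relies on what the printed stored values show: the edge produced by the colon operator is slightly larger than the literal 0.3. The variable x is illustrative.

```matlab
% Edges as in the code above
j = [0 0.1:0.1:1.2 Inf];
% A data value typed as the literal 0.3
x = 0.3;
% Print both stored values with 20 decimals
fprintf('%.20f\n', x, j(4));
% The literal is strictly smaller than the edge that displays as 0.3000 ...
disp(x < j(4))
% ... so discretize assigns it to the bin that starts at 0.2
disp(discretize(x, j))
```

This is consistent with 0.3 showing up in out{3,1} even though the intended bin is >=0.30 & <0.40.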
So, what to do now? There is a workaround: scale the data up to integers and operate on those.
Since the values in the 2nd column of matrix_out have at most two digits after the decimal point, scale up by a factor of 10^2.
%Scale up by a factor of 100
%Scaling the data
vec = floor(matrix_out(:,2)*100);
%Scaling the bins
j = [0 10:10:120 Inf];
%Discretize the data according to the scaled values
idx = discretize(vec,j);
%Split the array according to the groups
out = splitapply(@(x) {x}, matrix_out, idx)
out = 13×1 cell array
    {31×2 double}
    { 6×2 double}
    {62×2 double}
    {40×2 double}
    {13×2 double}
    {10×2 double}
    {10×2 double}
    { 8×2 double}
    { 6×2 double}
    {11×2 double}
    {11×2 double}
    {13×2 double}
    { 3×2 double}
disp(out{3,1})
  262.0000    0.2200
  264.0000    0.2600
  265.0000    0.2400
  266.0000    0.2400
  267.0000    0.2000
  269.0000    0.2500
  270.0000    0.2600
  271.0000    0.2900
  274.0000    0.2700
  275.0000    0.2200
  276.0000    0.2500
  277.0000    0.2100
  278.0000    0.2700
  279.0000    0.2200
  280.0000    0.2300
  281.0000    0.2600
  282.0000    0.2900
  283.0000    0.2700
  284.0000    0.2200
  285.0000    0.2400
  286.0000    0.2300
  287.0000    0.2500
  288.0000    0.2600
  289.0000    0.2400
  290.0000    0.2500
  291.0000    0.2600
  292.0000    0.2400
  293.0000    0.2500
  295.0000    0.2900
  296.0000    0.2600
  297.0000    0.2500
  299.0000    0.2600
  300.0000    0.2800
  301.0000    0.2600
  302.0000    0.2400
  303.0000    0.2700
  304.0000    0.2500
  306.0000    0.2600
  308.0000    0.2500
  309.0000    0.2800
  310.0000    0.2500
  311.0000    0.2500
  313.0000    0.2500
  315.0000    0.2600
  316.0000    0.2700
  317.0000    0.2600
  319.0000    0.2600
  321.0000    0.2900
  322.0000    0.2100
  324.0000    0.2600
  325.0000    0.2700
  326.0000    0.2900
  327.0000    0.2900
  328.0000    0.2700
  330.0000    0.2600
  332.0000    0.2500
  337.0000    0.2900
  339.0000    0.2900
  345.0000    0.2800
  448.0000    0.2800
  449.0000    0.2600
  450.0000    0.2100
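As a small self-contained variation on the scaling idea, the sketch below uses made-up data (not matrix_out) and round instead of floor. That substitution is an assumption on my part, not what the code above does: a product such as 0.29*100 can land just below the integer in double precision, and round maps it back to the nearest integer before binning. The names M, scaled and edges are illustrative.

```matlab
% Made-up data: column 1 is an ID, column 2 a value with two decimals
M = [1 0.05; 2 0.10; 3 0.29; 4 0.30; 5 0.31];
% Scale to integer hundredths; round is a swapped-in choice (the code above uses floor)
scaled = round(M(:,2) * 100);
% Integer edges for the bins <0.10, >=0.10 & <0.20, ..., >=0.40
edges = [0 10 20 30 40 Inf];
% Bin index for every row
idx = discretize(scaled, edges);
% Show ID, value and assigned bin side by side
disp([M idx])
```

Here 0.30 falls into the fourth bin (>=0.30 & <0.40), while 0.29 stays in the third. Note that splitapply needs every group number from 1 to max(idx) to occur at least once, so the sketch stops at the bin index.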


Release

R2021b
