Concatenate arrays of different length into a matrix

Assume I have two arrays (time-series) of the form:
A = [NaN, 2, 3, 4, 5, 6, 7, NaN]
B = [5, NaN, 6, 7, NaN, 8, 9, 10, 11, 12]
Since two arrays of different lengths cannot be concatenated with horzcat (obviously), how can I combine them to obtain an 8x2 matrix in which the available observations line up? My real time-series are long, so this is just an example, but it shows how crucial it is for the observations to match. Ideally, the output should be:
C = [NaN, 2, 3, 4, 5, 6, 7, NaN; 5, NaN, 6, 7, NaN, 8, 9, 10]
Thanks
Stefano

3 Comments

jonas on 29 Aug 2018
Edited: jonas on 29 Aug 2018
I have a feeling that you can fix this easily with synchronize() if you just provide some more details and data. I guess you want to sync two time-series in time?
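To illustrate the synchronize() suggestion, here is a minimal sketch. It assumes the two series are monthly and start on the same date; the start date and the variable names tA, tB, TA, TB, TT and TI are made up for the example, since the thread gives no real time stamps. It needs R2016b or later for timetable and synchronize.

```matlab
% Example vectors from the question, as column vectors.
A = [NaN; 2; 3; 4; 5; 6; 7; NaN];
B = [5; NaN; 6; 7; NaN; 8; 9; 10; 11; 12];

% Hypothetical monthly time stamps (assumption: both series start Jan 2018).
tA = datetime(2018,1,1) + calmonths(0:numel(A)-1)';
tB = datetime(2018,1,1) + calmonths(0:numel(B)-1)';

% One timetable per series; the variables are named A and B.
TA = timetable(A, 'RowTimes', tA);
TB = timetable(B, 'RowTimes', tB);

% Default: union of the row times, gaps filled with NaN (10 rows here).
TT = synchronize(TA, TB);

% Keep only the time stamps present in both series (8 rows here).
TI = synchronize(TA, TB, 'intersection');
C  = [TI.A, TI.B];   % plain 8x2 numeric matrix
```

The point of going through timetables is that rows are matched by date rather than by position, so the two columns stay aligned even when the series start or end at different times.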
"Since two arrays of different length can not be horzcat (obviously),"
I didn't have any problems using horzcat:
>> A = [NaN, 2, 3, 4, 5, 6, 7, NaN];
>> B = [5, NaN, 6, 7, NaN, 8, 9, 10, 11, 12];
>> horzcat(A,B)
ans =
NaN 2 3 4 5 6 7 NaN 5 NaN 6 7 NaN 8 9 10 11 12
Apologies for the misunderstanding, Stephen. The arrays are column vectors of the form
A = [NaN; 2; 3; 4; 5; 6; 7; NaN];
B = [5; NaN; 6; 7; NaN; 8; 9; 10; 11; 12];
Therefore dimensions are inconsistent for horzcat.
Thanks jonas. I'll have a look at synchronize(). By the way, I need to run the MS_Regress_Fit function, where the dependent variable is a matrix with two columns. My imported data are all vectors of size 187x1 that contain NaN. The problem is that MS_Regress_Fit does not accept NaN in the time-series. Therefore, before concatenating, I need to run
milliq1(isnan(milliq1))=[];
lnavilliq(isnan(lnavilliq))=[];
But this command reduces the dimensions according to the number of NaNs so I'm unable to concatenate the two arrays.
Thanks
S


Accepted Answer

Stephen23 on 29 Aug 2018
Edited: Stephen23 on 29 Aug 2018
Truncate to shortest length using indexing:
>> N = min(numel(A),numel(B));
>> [A(1:N);B(1:N)]
ans =
NaN 2 3 4 5 6 7 NaN
5 NaN 6 7 NaN 8 9 10
Pad to longest length using padcat:
>> padcat(A,B)
ans =
NaN 2 3 4 5 6 7 NaN NaN NaN
5 NaN 6 7 NaN 8 9 10 11 12
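As far as I know, padcat is a File Exchange submission rather than a built-in function, so it has to be downloaded first. If that is not an option, the same two results can be sketched with plain indexing. The version below assumes the column-vector layout mentioned in the comments, so the result is N-by-2 rather than 2-by-N; the names C, D, M and N are only illustrative.

```matlab
% Example vectors from the question, as column vectors.
A = [NaN; 2; 3; 4; 5; 6; 7; NaN];
B = [5; NaN; 6; 7; NaN; 8; 9; 10; 11; 12];

% Pad to the longest length: preallocate with NaN, then fill each column.
N = max(numel(A), numel(B));
C = NaN(N, 2);
C(1:numel(A), 1) = A;
C(1:numel(B), 2) = B;    % C is 10x2, rows 9:10 of column 1 stay NaN

% Truncate to the shortest length instead.
M = min(numel(A), numel(B));
D = [A(1:M), B(1:M)];    % D is 8x2
```

Both variants match observations by position, which is only correct if the two series start at the same time step.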

5 Comments

Thanks for your reply. This solves the problem! However, may I also ask how to achieve the same result after first eliminating the NaNs?
Thanks
S
Presumably you want to keep as much data as you can: first remove the NaNs, then join together as shown. What do you expect the output to look like? Please show using your example vectors.
This is what I get when I concatenate the two arrays. There are NaN cells that must be excluded, otherwise the MS_Regress_Fit function doesn't work (Error using checkInputs (line 155) NaN values found for row #138, column #1 of dep matrix.). However, if I just remove the NaNs with
lnavilliq(isnan(lnavilliq))=[];
I get a shorter vector in which observations no longer correspond to the same month: they shift. The best solution would be to substitute each NaN with a blank (missing value) that is hopefully accepted by the function. Substituting with zeros is not a good option, since zeros are numbers that would take part in the estimation.
The two concatenated arrays have the same length, including NaN, and look like this:
c =
1.2983 0.4773
1.1332 -0.6899
1.8560 -0.0129
1.0032 -1.4503
0.7970 0.5510
0.9739 -0.1946
1.3806 -0.2787
0.5555 -1.5485
0.4629 -0.0439
1.0797 2.0374
1.2445 -0.6657
1.1939 -0.2186
1.6811 0.3529
1.5634 2.0443
1.7282 -1.1497
1.4664 -1.8472
1.5607 0.1371
1.3378 -0.2496
1.7870 -0.4482
1.0397 -0.5281
1.9086 -1.6239
1.6873 -1.4588
1.5608 -0.3559
1.8775 0.6770
1.9413 -0.7665
1.4411 2.5044
0.8023 1.2282
2.0759 -0.1093
2.0066 -0.1057
1.3641 -0.6416
1.9938 0.3063
1.6085 -1.7264
1.1130 1.5654
2.2765 -1.6221
1.3514 -0.6744
1.3893 -1.1095
1.3587 -0.6436
1.3541 -0.4666
1.2252 -0.9027
1.0016 -1.1062
1.4689 0.1506
1.3808 -2.1854
0.6564 -0.3663
-0.3188 -1.3690
0.9186 0.4180
0.9457 -0.8999
0.7406 0.3385
1.0302 -5.5230
0.8556 -2.1801
-0.0202 NaN
0.6013 -1.4619
0.2877 -2.1366
0.8248 -1.1598
0.4809 -0.0078
1.0184 -2.7062
-0.1584 -1.2824
0.3751 -1.7375
1.1171 -1.1804
0.7554 -2.4109
-0.0841 -1.3542
-0.5512 -2.2511
-0.5083 0.6821
-0.5324 -3.4422
-0.0031 -4.9069
-0.1887 -0.9987
0.2940 -3.2929
0.3095 -0.1119
-0.1968 NaN
0.7959 -2.1006
-0.3138 -1.4442
-0.5657 -2.7997
-0.5009 -1.4766
0.4286 -5.9956
-0.7273 -1.3121
-0.2579 -2.7221
0.5273 -1.3853
0.1238 -0.9059
0.3987 -6.3520
0.2423 NaN
-0.0565 -1.4976
-0.2496 -2.5330
0.0058 -1.6070
-0.0914 -2.1682
0.0641 -3.3242
-0.0490 -3.5676
-0.3858 -1.1664
-0.0840 -1.1610
-0.5133 1.1176
-0.0704 -3.9403
-0.2704 -0.4797
-0.3922 -0.8369
-0.3409 -0.5190
-0.1485 -0.5157
0.8376 -1.6012
0.8932 -0.8214
-0.0916 0.1417
0.4155 -3.8836
0.7287 -1.2501
0.8981 -0.3382
0.2843 -2.4276
0.2023 -1.5962
0.6258 -2.0508
0.7766 -2.0717
1.3231 0.2829
1.3043 NaN
0.4520 -0.5393
0.7249 -0.1975
-0.1968 -0.4461
0.2015 -4.5125
0.2891 -1.0493
0.1692 -0.4110
0.4995 -0.2946
0.6838 -1.0120
0.5122 -2.4485
0.1746 -2.1079
2.7786 2.2981
1.6221 -1.3682
1.3131 -1.1916
0.3543 -0.1228
0.6993 -1.9134
1.0836 -1.1151
1.9949 -0.0932
0.1317 -0.0963
2.1317 -3.4640
0.8263 -1.3359
1.7576 -3.2542
0.6173 NaN
0.9611 -0.6623
1.3373 -0.9917
0.3760 -1.8159
1.3505 -0.6379
0.7448 -1.0151
0.5857 -2.5806
1.6835 0.0680
0.3744 0.1108
1.7403 0.8857
0.3419 -0.0195
NaN -2.6076
0.8791 0.7341
2.7658 1.8072
1.9003 -1.4054
1.8141 -0.5358
0.5828 -1.6752
1.1411 -1.2395
0.1411 1.3425
0.1023 -3.9126
1.6040 -1.6096
1.8897 -2.0519
1.3641 -1.7285
2.4501 -1.9090
1.0087 -5.8151
0.7368 -2.3125
-0.0590 -3.5955
2.7889 -3.4589
-0.1802 -0.9789
1.0300 -1.5407
0.0753 -0.2788
-0.0071 -0.5563
2.9985 -0.7401
1.4668 -1.0108
0.3148 -1.1047
-0.2356 -1.5385
1.1109 0.3701
1.2683 -2.1609
0.8169 -5.6065
1.8608 0.0400
2.2071 -1.5310
1.1000 0.4773
2.2645 -5.9066
0.5449 -2.6489
0.3854 -4.7336
0.3259 -0.3168
0.3438 -2.9846
1.9679 -2.0883
0.5939 1.6384
1.3507 -7.2275
0.7494 2.9973
0.0491 -2.5903
-0.5989 -0.6833
-0.6269 -1.2571
-0.2336 -3.3118
-0.4400 -1.5700
0.8773 -0.7454
1.2539 -0.5627
2.2508 -1.3117
0.3083 -1.5654
1.6622 -6.5992
Shortening the two arrays creates a mismatch between the two time-series.
Thanks
@Stefano Grillini: you really have two choices: either interpolate to fill in the NaN data, or remove the entire row from your data wherever there is a NaN. Judging by your data, interpolation does not make much sense; removing the rows, however, is easy:
idx = any(isnan(c),2);
new = c(~idx,:)
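A small worked example of the row removal above, using a made-up 4x2 matrix (the values are not from the thread). It also shows rmmissing, which should give the same result on R2016b or later, and keeps the original row numbers in the hypothetical variable keptRows so that the surviving observations can still be traced back to their months.

```matlab
% Made-up data: rows 2 and 3 each contain a NaN.
c = [1.0  0.5;
     2.0  NaN;
     NaN  1.5;
     4.0  2.5];

idx = any(isnan(c), 2);   % true for every row with at least one NaN
new = c(~idx, :);         % rows 1 and 4 remain: [1.0 0.5; 4.0 2.5]

keptRows = find(~idx);    % original row numbers of the kept rows: [1; 4]

alt = rmmissing(c);       % built-in equivalent for numeric matrices
```

Because whole rows are dropped, the two columns stay paired with each other, but the remaining series is no longer evenly spaced in time. Whether that is acceptable depends on what MS_Regress_Fit assumes about the data.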
Thank you very much @Stephen! It's actually the only choice


More Answers (1)

I'd say the most straightforward method would be to use a cell array to combine vectors of whatever dimensions you have, and then use Cell{a,b}(x,y) to access the elements.
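A minimal sketch of the cell-array idea, using the example vectors from the question; the names S, x and lens are only illustrative.

```matlab
% Example vectors from the question, as column vectors.
A = [NaN; 2; 3; 4; 5; 6; 7; NaN];
B = [5; NaN; 6; 7; NaN; 8; 9; 10; 11; 12];

% A cell array can hold vectors of different lengths side by side.
S = {A, B};                  % 1x2 cell

% Brace indexing selects the cell, parentheses index into its contents.
x = S{1,2}(10, 1);           % 10th element of B, i.e. 12

% Length of each stored vector.
lens = cellfun(@numel, S);   % [8 10]
```

Note that this only stores the vectors together. A function that expects a numeric matrix would presumably still need them padded or truncated to a common length.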
