Merging data from different matrices with different row numbers by matching date

5 visualizzazioni (ultimi 30 giorni)
I have 1600 matrices (1600 companies info) each containing data similar to the following (first column is company ID, second is a date, and third is return)
% code
10001 73500 0.05
10001 73440 0.06
10001 73320 0.04
10001 73280 0.03
..... ..... ....
The issue is each of these 1600 matrices have different row number depending on the availability of the data from the database (i.e return is available for different companies on different dates and different number of periods).
Now the task at hand is to merge all these matrices into one.
So I created a vector of all possible dates like in the following:
% code
73500
73480
73440
73380
73320
73300
73280
.....
I am looking to extract the data so that each company return is listed in a column at the right date (row), the rest of the rows should show Nan. Something like the following:
% code
73500 0.04 nan nan ........ for the rest of companies until column 1601
73480 nan nan 0.02
73440 0.05 0.04 nan
73380 0.03 0.05 0.02
73320 nan 0.02 nan
73300 0.05 nan nan
73280 nan nan 0.05
..... .... .... ....
Appreciate any help. Thanks
  3 Commenti
OCDER
OCDER il 26 Set 2017
I sense intersect will be involved to find matching indices of two matrices based on dates.

Accedi per commentare.

Risposta accettata

Jan
Jan il 26 Set 2017
Modificato: Jan il 26 Set 2017
result = nan(nDates, nCompanies + 1);
result(:, 1) = Dates;
for iComp = 1:nCompanies
aMatrix = <how to get your iComp.th matrix>
[iR, iM] = ismember(Dates, aMatrix(:, 2));
result(iR, iComp + 1) = aMatrix(iM, 3);
end
  2 Commenti
John
John il 26 Set 2017
Thank you Jan The code seem to be working except the last line
aMatrix(iM, 3)
throw the following error:
Subscript indices must either be real positive integers or logicals.
Any thoughts on that?
John
John il 26 Set 2017
Ok Jan. I figured it out. Just add
iM(iM==0)=[]
before that last line and it will work.
Thanks mate.

Accedi per commentare.

Più risposte (2)

KL
KL il 26 Set 2017
If you have R2016b or later, timetable does this with synchronize command,
timetable_summary = synchronize(timetable_company1,timetable_company2);

Peter Perkins
Peter Perkins il 27 Set 2017
If you are using R2016b or later, timetable provides two really good ways to approach this. The second approach also works with tables, which have been in MATLAB since R2013b.
Leaving aside the question of what exactly are those "dates" (days since??? datenums divided by 10?), try these.
1) Create separate timetables and synchronize
>> x1 = [10001 73500 0.05; ...
10001 73440 0.06; ...
10001 73320 0.04; ...
10001 73280 0.03];
>> x2 = [10002 73380 0.5; ...
10002 73320 0.6; ...
10002 73300 0.4; ...
10002 73280 0.3];
>> t1 = datetime(x1(:,2),'ConvertFrom','datenum');
>> CompanyA = timetable(x1(:,3),'RowTimes',t1,'VariableNames',{'Return'})
CompanyA =
4×1 timetable
Time Return
____________________ ______
27-Mar-0201 00:00:00 0.05
26-Jan-0201 00:00:00 0.06
28-Sep-0200 00:00:00 0.04
19-Aug-0200 00:00:00 0.03
>> t2 = datetime(x2(:,2),'ConvertFrom','datenum');
>> CompanyB = timetable(x2(:,3),'RowTimes',t2,'VariableNames',{'Return'})
CompanyB =
4×1 timetable
Time Return
____________________ ______
27-Nov-0200 00:00:00 0.5
28-Sep-0200 00:00:00 0.6
08-Sep-0200 00:00:00 0.4
19-Aug-0200 00:00:00 0.3
>> returns = synchronize(CompanyA,CompanyB,'union')
returns =
6×2 timetable
Time Return_CompanyA Return_CompanyB
____________________ _______________ _______________
19-Aug-0200 00:00:00 0.03 0.3
08-Sep-0200 00:00:00 NaN 0.4
28-Sep-0200 00:00:00 0.04 0.6
27-Nov-0200 00:00:00 NaN 0.5
26-Jan-0201 00:00:00 0.06 NaN
27-Mar-0201 00:00:00 0.05 NaN
2) Create one timetable (amd maybe stop there) and unstack
>> x = [x1; x2];
>> t = datetime(x(:,2),'ConvertFrom','datenum');
>> tt = timetable(x(:,1),x(:,3),'RowTimes',t,'VariableNames',{'Company' 'Return'});
>> tt.Company = categorical(tt.Company,[10001 10002],{'CompanyA' 'CompanyB'})
tt =
8×2 timetable
Time Company Return
____________________ ________ ______
27-Mar-0201 00:00:00 CompanyA 0.05
26-Jan-0201 00:00:00 CompanyA 0.06
28-Sep-0200 00:00:00 CompanyA 0.04
19-Aug-0200 00:00:00 CompanyA 0.03
27-Nov-0200 00:00:00 CompanyB 0.5
28-Sep-0200 00:00:00 CompanyB 0.6
08-Sep-0200 00:00:00 CompanyB 0.4
19-Aug-0200 00:00:00 CompanyB 0.3
>> returns = unstack(tt,'Return','Company')
returns =
6×2 timetable
Time CompanyA CompanyB
____________________ ________ ________
27-Mar-0201 00:00:00 0.05 NaN
26-Jan-0201 00:00:00 0.06 NaN
28-Sep-0200 00:00:00 0.04 0.6
19-Aug-0200 00:00:00 0.03 0.3
27-Nov-0200 00:00:00 NaN 0.5
08-Sep-0200 00:00:00 NaN 0.4

Categorie

Scopri di più su Dates and Time in Help Center e File Exchange

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!

Translated by