multiple condition sumif without loop
9 visualizzazioni (ultimi 30 giorni)
Mostra commenti meno recenti
I have the following table and want to sum over unique dates and Accounts to get the daily changes per account
TransactionID = {'102001';'102002';'102003';'102004';'102005';'102006';'102007';'102008';'102009';'102010'};
Date = datetime({'2021-02-06';'2021-02-06';'2021-02-06';'2021-02-06';'2021-02-06';'2021-02-06';'2021-02-09';'2021-02-09';'2021-02-09';'2021-02-17'},"InputFormat","uuuu-MM-dd");
AmountDebited = [3000; 0; 0; 3000; 4443; 2.02877864; 430; 50; 200; 0.297598984636403];
AccountDebited = ["TD-CAD";"LC-CAD";"LC-CAD";"SI-CAD";"SP-CAD";"LC-USD";"SP-CAD";"SP-CAD";"SP-CAD";"LC-USD"];
AmountCredited = [3000; 0; 0; 3000; 2.03048665; 4450; 0.18713085; 0.02177471; 0.0869854; 0.295498984636403];
AccountCredited = ["SP-CAD"; "MM-LPT"; "KN-DIR"; "SP-CAD"; "LC-USD"; "SI-CAD"; "LC-USD"; "LC-USD"; "LC-USD"; "MM-USD"];
X = table(TransactionID, Date, AmountDebited, AccountDebited, AmountCredited, AccountCredited);
Right Now I can do it using the following for loop to compute the result but My actual table has thousands of records and the calculation takes forever so have been trying to get it to work using indexes but had no luck so far.
UniqueDebit = unique(X.AccountDebited);
UniqueCredit = unique(X.AccountCredited);
UniqueConc = unique([UniqueDebit;UniqueCredit]);
UniqueDates = unique(X.Date);
for i = 1:length(UniqueDates)
for j = 1:height(UniqueConc)
RowsCredited = find(X.Date == UniqueDates(i) & X.AccountCredited == UniqueConc(j));
RowsDebited = find(X.Date == UniqueDates(i) & X.AccountDebited == UniqueConc(j));
DailyCredited = sum(X.AmountCredited(RowsCredited));
DailyDebited = sum(X.AmountDebited(RowsDebited));
DailyChanges = DailyCredited - DailyDebited;
XMatrix(i,j) = DailyChanges;
end
end
XTable=array2table(XMatrix);
XTable.Properties.VariableNames = UniqueConc;
XTable.Properties.RowNames = string(UniqueDates);
0 Commenti
Risposta accettata
Jan
il 26 Mag 2022
Modificato: Jan
il 26 Mag 2022
Start with avoiding repeated work:
XMatrix = zeros(length(UniqueDates), height(UniqueConc)); % Pre-allocate!!!
xCredit = X.AccountCredited; % Abbreviation
xDebit = X.AccountDebited;
for i = 1:length(UniqueDates)
tmp = (X.Date == UniqueDates(i)); % Move out of the inner loop
for j = 1:height(UniqueConc)
RowsCredited = (tmp & xCredited == UniqueConc(j));
RowsDebited = (tmp & xDebited == UniqueConc(j));
DailyCredited = sum(X.AmountCredited(RowsCredited)); % [EDITED]
DailyDebited = sum(X.AmountDebited(RowsDebited)); % [EDITED]
XMatrix(i,j) = DailyCredited - DailyDebited;
end
end
Avoid the find(), because logical indexing is faster.
Is XMatrix preallocated?
3 Commenti
Jan
il 26 Mag 2022
Modificato: Jan
il 26 Mag 2022
It is hard to optimize code without having realistic input data. I cannot estimate, if an accumarray approach is faster here. Are you sure, that the loop is the bottleneck?
Another idea is to use ismember once, such that Matlab can search in numerical indices instead of string arrays:
xAccDebit = X.AccountDebited;
xAccCredit = X.AccountCredited;
xAmDebit = X.AmountDebited;
xAmCredit = X.AmountCredited;
uDebit = unique(xAccDebit);
uCredit = unique(xAccCredit);
uConc = unique([uDebit; uCredit]);
[uDates, ~, uDateInd] = unique(X.Date);
[~, creditInd] = ismember(xAccCredit, uConc);
[~, debitInd] = ismember(xAccDebit, uConc);
X = zeros(numel(uDates), numel(uConc)); % Pre-allocate
for i = 1:numel(uDates)
tmp = (uDateInd == i); % same as: (X.Date == uDates(i));
mCreditInd = creditInd .* tmp; % mask FALSE in tmp as 0 in vector
mDebitInd = debitInd .* tmp; %
for j = 1:numel(uConc)
RowsCredited = (mCreditInd == j);
RowsDebited = (mDebitInd == j);
DailyCredited = sum(xAmCredit(RowsCredited));
DailyDebited = sum(xAmDebit(RowsDebited));
X(i,j) = DailyCredited - DailyDebited;
end
end
Is numel(uDates) much larger than numel(uConc)? Maybe swapping the loops is faster.
I've edited the code in my first answer: I've abbreviated "X.AccountDebited" and "X.AmountDebited" both by "xDebited".
Are you working with less than 65536 elements? Then it might save some time, if you convert mCreditInd and mDebitInd to UINT16 and the loop e.g. to:
for j = uint16(1):uint16(numel(uConc))
Più risposte (1)
Seth Furman
il 2 Giu 2022
It's worth noting that a lot of these computations can be expressed more succinctly using groupsummary, outerjoin, and fillmissing.
TransactionID = {'102001';'102002';'102003';'102004';'102005';'102006';'102007';'102008';'102009';'102010'};
Date = datetime({'2021-02-06';'2021-02-06';'2021-02-06';'2021-02-06';'2021-02-06';'2021-02-06';'2021-02-09';'2021-02-09';'2021-02-09';'2021-02-17'},"InputFormat","uuuu-MM-dd");
AmountDebited = [3000; 0; 0; 3000; 4443; 2.02877864; 430; 50; 200; 0.297598984636403];
AccountDebited = ["TD-CAD";"LC-CAD";"LC-CAD";"SI-CAD";"SP-CAD";"LC-USD";"SP-CAD";"SP-CAD";"SP-CAD";"LC-USD"];
AmountCredited = [3000; 0; 0; 3000; 2.03048665; 4450; 0.18713085; 0.02177471; 0.0869854; 0.295498984636403];
AccountCredited = ["SP-CAD"; "MM-LPT"; "KN-DIR"; "SP-CAD"; "LC-USD"; "SI-CAD"; "LC-USD"; "LC-USD"; "LC-USD"; "MM-USD"];
X = table(TransactionID, Date, AmountDebited, AccountDebited, AmountCredited, AccountCredited);
Find total amounts credited/debited for each date-account pair
credited = groupsummary(X,["Date","AccountCredited"],"sum",["AmountCredited"])
debited = groupsummary(X,["Date","AccountDebited"],"sum",["AmountDebited"])
Match total amounts credited/debited by date-account pair
credited.GroupCount = [];
debited.GroupCount = [];
sums = outerjoin(credited,debited,LeftKeys=["Date","AccountCredited"],RightKeys=["Date","AccountDebited"],MergeKeys=true)
Fill missing data
sums = fillmissing(sums,"constant",0,DataVariables=[3 4])
Compute net change by date-account pair
sums.NetChange = sums.sum_AmountCredited - sums.sum_AmountDebited
0 Commenti
Vedere anche
Categorie
Scopri di più su Loops and Conditional Statements in Help Center e File Exchange
Community Treasure Hunt
Find the treasures in MATLAB Central and discover how the community can help you!
Start Hunting!