How to create a group summary of table such that it takes latest values ?

4 visualizzazioni (ultimi 30 giorni)
Hello All,
I am having some problems in creating a summary of table.
I will try to explain my problem in detail.
I have a table named "InputSessionTable" which is shown like this. (Current Input)
% Create table
CustomerName = ["Customer1" ; "Customer1" ; "Customer1" ; "Customer1" ; "Customer2" ; "Customer2" ];
ProjectName = ["Project1" ; "Project1" ; "Project1" ; "Project3" ; "Project2" ; "Project2" ];
VehicleName = ["Vehicle1" ; "Vehicle1" ; "Vehicle1" ; "Vehicle3" ; "Vehicle2" ;"Vehicle2" ];
VehicleType = ["VehicleType1" ; "VehicleType1" ; "VehicleType1" ; "VehicleType3" ; "VehicleType2" ; "VehicleType2"];
EngineerName = ["Engineer1" ; "Engineer1" ; "Engineer3" ; "Engineer3" ; "Engineer2" ; "Engineer2" ];
Date = [datetime(2023,12,27); datetime(2023,12,29); datetime(2023,12,30) ; datetime(2023,12,31); datetime(2023,12,28) ; datetime(2023,12,30)];
VerNo = [1 ; 2 ; 3 ; 1 ; 1 ;2 ];
InputSessionTable = table(CustomerName,ProjectName,VehicleName,VehicleType,EngineerName,Date,VerNo);
InputSessionTable =
6×7 table
CustomerName ProjectName VehicleName VehicleType EngineerName Date VerNo
____________ ___________ ___________ ______________ ____________ _____________ _____
"Customer1" "Project1" "Vehicle1" "VehicleType1" "Engineer1" "27-Dec-2023" 1
"Customer1" "Project1" "Vehicle1" "VehicleType1" "Engineer1" "29-Dec-2023" 2
"Customer1" "Project1" "Vehicle1" "VehicleType1" "Engineer3" "30-Dec-2023" 3
"Customer1" "Project3" "Vehicle3" "VehicleType3" "Engineer3" "31-Dec-2023" 1
"Customer2" "Project2" "Vehicle2" "VehicleType2" "Engineer2" "28-Dec-2023" 1
"Customer2" "Project2" "Vehicle2" "VehicleType2" "Engineer2" "30-Dec-2023" 2
This table is created from the files I am having in the directory.
Here Customer Name, Project Name, Vehicle Name and Vehicle Type are the unique characteristics of the file.
If you observe first 3 rows that Customer Name, Project Name, Vehicle Name and Vehicle Type is same but their Version number is updated to 3.
Like Wise for row 5 and 6 Version number is updated to 2.
Where as row 4 its unique and does not any newer versions.
I am trying to find a way such that it create below table with "OutputSessionTable". (Desired Output)
OutputSessionTable =
3×7 table
CustomerName ProjectName VehicleName VehicleType EngineerName Date VerNo
____________ ___________ ___________ ______________ ____________ _____________ _____
"Customer1" "Project1" "Vehicle1" "VehicleType1" "Engineer3" "30-Dec-2023" 3
"Customer1" "Project3" "Vehicle3" "VehicleType3" "Engineer3" "31-Dec-2023" 1
"Customer2" "Project2" "Vehicle2" "VehicleType2" "Engineer2" "30-Dec-2023" 2
Now if you observe it takes the latest version of each unique project. This eliminates the rows with same unique chaterstics but older version number.
I have tried to use a function called groupsummary. This helps me in creating a unique chatertics of table.
GroupSummaryTable = groupsummary(InputSessionTable,1:4,"max","VerNo")
When using this function I get an putput like this.
GroupSummaryTable =
3×6 table
CustomerName ProjectName VehicleName VehicleType GroupCount max_VerNo
____________ ___________ ___________ ______________ __________ _________
"Customer1" "Project1" "Vehicle1" "VehicleType1" 3 3
"Customer1" "Project3" "Vehicle3" "VehicleType3" 1 1
"Customer2" "Project2" "Vehicle2" "VehicleType2" 2 2
Now here I am stuck that how can I include Engineer Name and Date column with this table in order to get "OutputSessionTable". Those are in string and datetime format.
Let me know your view on this.
Thanks in Advance
  1 Commento
Stephen23
Stephen23 il 4 Gen 2024
Modificato: Stephen23 il 4 Gen 2024
It appears that GROUPSUMMARY cannot do this.
You could try:
  • GROUPFILTER
  • SORT, UNIQUE, then indexing.
If you want more help upload a sample table in a mat file, by clicking the paperclip button.

Accedi per commentare.

Risposta accettata

Dyuman Joshi
Dyuman Joshi il 4 Gen 2024
Modificato: Dyuman Joshi il 4 Gen 2024
It is helpful to provide/attach data (or a sample of it) with the question, so it is easier for us to work with.
Also, In general, it is better to use the datetime data type to represent any data with dates (or time or a mix of both). However, if you want to keep it as string, you can remove that line.
%Data for table
CustomerName = "Customer" + [1 1 1 1 2 2].';
ProjectName = "Project" + [1 1 1 3 2 2].';
VehicleName = "Vehicle" + [1 1 1 3 2 2].';
VehicleType = "VehicleType" + [1 1 1 3 2 2].';
EngineerName = "Engineer" + [1 1 3 3 2 2].';
Date = ["27-Dec-2023" "29-Dec-2023" "30-Dec-2023" "31-Dec-2023" "28-Dec-2023" "30-Dec-2023"].';
VerNo = [1 2 3 1 1 2].';
T = table(CustomerName,ProjectName,VehicleName,VehicleType,EngineerName,Date,VerNo);
%Convert the Date variable to datetime() data type
%convertvars() can also be used for the same
T.Date = datetime(T.Date, 'InputFormat', 'dd-MMM-yyyy');
%% Sort the rows of the table first by VerNo then by Date, in descending order
T = sortrows(T, {'VerNo', 'Date'}, 'descend')
T = 6×7 table
CustomerName ProjectName VehicleName VehicleType EngineerName Date VerNo ____________ ___________ ___________ ______________ ____________ ___________ _____ "Customer1" "Project1" "Vehicle1" "VehicleType1" "Engineer3" 30-Dec-2023 3 "Customer2" "Project2" "Vehicle2" "VehicleType2" "Engineer2" 30-Dec-2023 2 "Customer1" "Project1" "Vehicle1" "VehicleType1" "Engineer1" 29-Dec-2023 2 "Customer1" "Project3" "Vehicle3" "VehicleType3" "Engineer3" 31-Dec-2023 1 "Customer2" "Project2" "Vehicle2" "VehicleType2" "Engineer2" 28-Dec-2023 1 "Customer1" "Project1" "Vehicle1" "VehicleType1" "Engineer1" 27-Dec-2023 1
%% Get the indices of the first occurence of the unique elements in same order
[~, idx] = unique(T.VerNo, 'stable')
idx = 3×1
1 2 4
%% Output
GsT = T(idx, :)
GsT = 3×7 table
CustomerName ProjectName VehicleName VehicleType EngineerName Date VerNo ____________ ___________ ___________ ______________ ____________ ___________ _____ "Customer1" "Project1" "Vehicle1" "VehicleType1" "Engineer3" 30-Dec-2023 3 "Customer2" "Project2" "Vehicle2" "VehicleType2" "Engineer2" 30-Dec-2023 2 "Customer1" "Project3" "Vehicle3" "VehicleType3" "Engineer3" 31-Dec-2023 1
Note that the order of rows obtained above is according to the descending values of VerNo.
  1 Commento
Jay
Jay il 4 Gen 2024
Thankyou for your answer.
It is working as expected.
Sorry for not attaching the table earlier. I have not added the code for creating the table.
I am using date in datetime format. But this date is also a part of filename which I am creating from the table. As this helps in creating a unique filename where I was converting datetime to string.
I had tried a bit of long way achive the answer. I am not sure if this would be helpful or not.
GroupSummaryTable = groupsummary(InputSessionTable,1:4,"max","VerNo");
Unrecognized function or variable 'InputSessionTable'.
Data = table();
for i = 1:height(GroupSummaryTable)
for j = 1:height(InputSessionTable)
if isequal(GroupSummaryTable.CustomerName(i),InputSessionTable.CustomerName(j))
if isequal(GroupSummaryTable.ProjectName(i),InputSessionTable.ProjectName(j))
if isequal(GroupSummaryTable.VehicleName(i),InputSessionTable.VehicleName(j))
if isequal(GroupSummaryTable.VehicleType(i),InputSessionTable.VehicleType(j))
if isequal(GroupSummaryTable.max_VerNo(i),InputSessionTable.VerNo(j))
Data = [Data;InputSessionTable(j,:)];
end
end
end
end
end
end
end

Accedi per commentare.

Più risposte (0)

Categorie

Scopri di più su Time Series Objects in Help Center e File Exchange

Prodotti


Release

R2021b

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!

Translated by