Using accumarray to organize large CSV
5 visualizzazioni (ultimi 30 giorni)
Mostra commenti meno recenti
I have a CSV with 4 columns, containing several thousands of rows of data. I have attached a sample data sheet with a small amount of data as an example.
I'm trying to write a code to import the data in a way that will be easy to create scatter plots/visualize the data, as well as calculate some basic stats (mean, median, std. dev., etc).
I want to group the data based on TWO of the columns: SITE NUMBER and DEPTH
The other two columns, d18O and date are what I need to plot/visualize
Essentially, my goal is to have separate variables or columns in a table that show d18O and DATE for each SITE NUMBER and DEPTH
I started using accumarray to accomplish this but I am having issues with the arrays, they are not showing up with all of the unique rows of data
T = readtable("data1.csv", 'VariableNamingRule','preserve');
[G,ID] = findgroups(T.("Site number"));
data = accumarray(G, T{:,4},[], @(x){T{x,:}});
0 Commenti
Risposta accettata
dpb
il 22 Set 2023
Modificato: dpb
il 22 Set 2023
tT=readtable("data1.csv");
tT.Properties.VariableNames(2)={'Site'}; % shorten to be more convenient to use
head(tT)
G=grpstats(tT,{'Site','Depth'},{'mean','median','std'},'DataVars',{'Date','d18O'})
hSc=rowfun(@doit,tT,'GroupingVariables',{'Site','Depth'},'InputVariables',{'Date','d18O'},'OutputFormat','uniform');
function h=doit(x,y)
figure
h=scatter(x,y,'filled');
xlabel('Date'),ylabel('d18O')
end
2 Commenti
dpb
il 22 Set 2023
NOTA BENE that you can get much more creative with the user function...
tT=readtable("data1.csv");
tT.Properties.VariableNames(2)={'Site'}; % shorten to be more convenient to use
hSc=rowfun(@doit,tT,'GroupingVariables',{'Site','Depth'},'InputVariables',{'Date','d18O','Site','Depth'},'OutputFormat','uniform');
function h=doit(x,y,s,d)
figure
h=scatter(x,y,'filled');
xlabel('Date'),ylabel('d18O')
legend(compose('Site %3d Depth %3d',s(1),d(1)))
end
Più risposte (1)
Voss
il 21 Set 2023
Modificato: Voss
il 21 Set 2023
"my goal is to have separate variables or columns in a table that show d18O and DATE for each SITE NUMBER and DEPTH"
Something like this?
T = readtable("data1.csv", 'VariableNamingRule','preserve');
disp(T);
gvars = ["Site number","Depth"];
vars = setdiff(T.Properties.VariableNames,gvars);
T_summary = groupsummary(T,gvars,@(x){x});
T_summary = removevars(T_summary,"GroupCount");
T_summary = renamevars(T_summary,"fun1_"+vars,vars)
T_summary.Date{1}
T_summary.d18O{1}
1 Commento
Voss
il 22 Set 2023
Or maybe it's more convenient just to sort the table so that rows with the same Site number and Depth are together:
T = readtable("data1.csv", 'VariableNamingRule','preserve');
disp(T);
T_sorted = sortrows(T,["Site number","Depth"]);
disp(T_sorted);
Vedere anche
Categorie
Scopri di più su Tables in Help Center e File Exchange
Community Treasure Hunt
Find the treasures in MATLAB Central and discover how the community can help you!
Start Hunting!











