Fast subsetting or indexing of data

Louise Wilson on 29 Sep 2020
Commented: Rik on 30 Sep 2020
I am working with large datasets which I am subsetting into various categories and saving as smaller files. What I am doing right now works, but it is quite time-consuming and error-prone, as it involves a lot of copy and paste.
For example, I have many files that I have split into those with boats and those without boats. I then split those by season. Is there a faster way to do this, where I apply the same command to a prescribed set of variables?
%% Comparisons... Season using water temp
boatsAbsent_t=boatsAbsent.Var1; %time variables
[BA_spring, BA_summer, BA_autumn, BA_winter]=indexSeasons(boatsAbsent_t); %index times into seasons
boatsPresent_t=boatsPresent.Var1;
[BP_spring, BP_summer, BP_autumn, BP_winter]=indexSeasons(boatsPresent_t);
%Subset PSD outputs and write to file
S=withtol(BA_spring,seconds(1));
BA_spring=boatsAbsent(S,:);
writetable(timetable2table(BA_spring),...
fullfile(folder,strcat(site,'_PSD_boatsAbsent_Spring.csv')));
S=withtol(BA_summer,seconds(1));
BA_summer=boatsAbsent(S,:);
writetable(timetable2table(BA_summer),...
fullfile(folder,strcat(site,'_PSD_boatsAbsent_Summer.csv')));
S=withtol(BA_autumn,seconds(1));
BA_autumn=boatsAbsent(S,:);
writetable(timetable2table(BA_autumn),...
fullfile(folder,strcat(site,'_PSD_boatsAbsent_Autumn.csv')));
S=withtol(BA_winter,seconds(1));
BA_winter=boatsAbsent(S,:);
writetable(timetable2table(BA_winter),...
fullfile(folder,strcat(site,'_PSD_boatsAbsent_Winter.csv')));
S=withtol(BP_spring,seconds(1));
BP_spring=boatsPresent(S,:); %subset rows, as in the boatsAbsent blocks above
writetable(timetable2table(BP_spring),...
fullfile(folder,strcat(site,'_PSD_boatsPresent_Spring.csv')));
S=withtol(BP_summer,seconds(1));
BP_summer=boatsPresent(S,:);
writetable(timetable2table(BP_summer),...
fullfile(folder,strcat(site,'_PSD_boatsPresent_Summer.csv')));
S=withtol(BP_autumn,seconds(1));
BP_autumn=boatsPresent(S,:);
writetable(timetable2table(BP_autumn),...
fullfile(folder,strcat(site,'_PSD_boatsPresent_Autumn.csv')));
S=withtol(BP_winter,seconds(1));
BP_winter=boatsPresent(S,:);
writetable(timetable2table(BP_winter),...
fullfile(folder,strcat(site,'_PSD_boatsPresent_Winter.csv')));
  3 Comments
Stephen23 on 29 Sep 2020
Meta-data is data, and data does not belong in variable names! Sticking meta-data into variable names, e.g. the season names:
BA_spring, BA_summer, BA_autumn, BA_winter
means that you force yourself into writing slow, inefficient code or doing lots of copy-and-paste. Rik correctly recommends that you put all of your data in arrays, rather than splitting it into separate variables.
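To make the point concrete, here is a minimal sketch of keeping the season as data. The table T, its columns, and the values in it are made up for illustration and are not taken from the original files.

```matlab
% Hypothetical example data: the season is a column, not part of a variable name.
t      = datetime(2020,1,15) + calmonths(0:5)';
Season = categorical({'Winter';'Winter';'Spring';'Spring';'Spring';'Summer'});
PSD    = [10; 12; 9; 11; 13; 8];
T      = table(t, Season, PSD);

% One indexing expression replaces a separately named variable per season.
springRows = T(T.Season == 'Spring', :);

% The same operation applied to every season in a loop.
names   = categories(T.Season);      % cell array of season names
meanPSD = zeros(numel(names), 1);
for n = 1:numel(names)
    meanPSD(n) = mean(T.PSD(T.Season == names{n}));
end
```

Because the season lives in a column, adding another season or category needs no new variable names and no extra copies of the code.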
Louise Wilson on 29 Sep 2020
Awesome, this helps a lot, thank you Stephen! I am glad I asked.

Accepted Answer

Rik on 29 Sep 2020
Whenever you find yourself copy-pasting code in MATLAB, you should consider using an array.
seasons={'Spring','Summer','Autumn','Winter'};
boatsPresent_t=boatsPresent.Var1; %time variables
boatsAbsent_t=boatsAbsent.Var1; %time variables
BP=cell(1,4);BA=cell(1,4);
[BP{:}]=indexSeasons(boatsPresent_t); %index times into seasons
[BA{:}]=indexSeasons(boatsAbsent_t); %index times into seasons
for n=1:numel(seasons)
S=withtol(BP{n},seconds(1));
BP_part=boatsPresent(S,:);
writetable(timetable2table(BP_part),...
fullfile(folder,strcat(site,'_PSD_boatsPresent_',seasons{n},'.csv')));
S=withtol(BA{n},seconds(1));
BA_part=boatsAbsent(S,:);
writetable(timetable2table(BA_part),...
fullfile(folder,strcat(site,'_PSD_boatsAbsent_',seasons{n},'.csv')));
end
If you have more states than just present and absent, you should consider putting those states in an array so you can use them to generate logical indices.
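A rough sketch of that idea, under several assumptions: the data sit in a single timetable with a boatState column, seasons are assigned by calendar month (a stand-in for indexSeasons, which is not shown in this thread and works from water temperature), and the files go to tempdir. All names and values below are hypothetical.

```matlab
% Synthetic stand-in data (hypothetical): one row per month of 2020.
t         = datetime(2020,1,15) + calmonths(0:11)';
boatState = repmat(["Present"; "Absent"], 6, 1);   % state kept as data
PSD       = (1:12)';
data      = timetable(t, boatState, PSD);

states  = ["Present", "Absent"];
seasons = ["Spring", "Summer", "Autumn", "Winter"];

% Assumed month-to-season lookup (months 1..12 -> index into seasons).
monthToSeason = [4 4 1 1 1 2 2 2 3 3 3 4];
seasonIdx = reshape(monthToSeason(month(data.Properties.RowTimes)), [], 1);

outFolder = tempdir;    % hypothetical output folder
site      = "siteA";    % hypothetical site label
nWritten  = 0;
for k = 1:numel(states)
    for n = 1:numel(seasons)
        % Logical index combining one state with one season.
        rows  = data.boatState == states(k) & seasonIdx == n;
        part  = data(rows, :);
        fname = fullfile(outFolder, ...
            site + "_PSD_boats" + states(k) + "_" + seasons(n) + ".csv");
        writetable(timetable2table(part), fname);
        nWritten = nWritten + 1;
    end
end
```

Adding a third state then only means adding one element to states; the loop body stays the same.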
  5 Comments
Louise Wilson on 30 Sep 2020
I think my issue is just how to name the variable the table is being stored in? BP.seasons{n} doesn't work.
Rik on 30 Sep 2020
If you want to have a dynamic field name you need to use this syntax:
name='foo';
S.(name)='bar';
But what is wrong with the code you posted? You shouldn't be storing data (i.e. the season) in a variable name. If you do, that will cause the same issue every time you want to use the variables.
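For comparison, a small sketch of both ways to keep the per-season subsets once they are computed: a cell array that runs parallel to seasons, and a struct filled with the dynamic field syntax. The contents of values are placeholders, not real data.

```matlab
seasons = {'Spring', 'Summer', 'Autumn', 'Winter'};
values  = {1:3, 4:6, 7:9, 10:12};    % hypothetical placeholder data per season

parts    = cell(1, numel(seasons));  % option 1: cell array parallel to seasons
bySeason = struct();                 % option 2: struct with dynamic field names
for n = 1:numel(seasons)
    parts{n} = values{n};
    bySeason.(seasons{n}) = values{n};   % note the parentheses around the name
end

% Looking up one season afterwards.
summerFromCell   = parts{strcmp(seasons, 'Summer')};
summerFromStruct = bySeason.Summer;
```

BP.seasons{n} fails because it looks for a field literally called seasons; BP.(seasons{n}) uses the text stored in seasons{n} as the field name. The cell array version keeps the season purely as data, which is closer to what is recommended above.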

Release

R2020a