How to search for channel name and numerical data in resulting struct after importing multiple data files?
Questions:
1.) To avoid using eval to name variables dynamically, how do I search the resulting struct of data and labels to link a channel name to a data column, and then analyze multiple channels that all have the same name?
2.) How do I properly write an if or switch case statement to deal with importing a single or multiple data files when the resulting workspace object is either a character array for a single file or a cell array for multiple files?
Background:
Currently using Matlab R2014b. I'm trying to write a script to select which data files, import / load the data, and place the data into a matrix or array or struct or whatever is most useful and appropriate for signal analysis, processing, and plotting afterward.
My data files are an export from a data acquisition tool (ATI VISION). The generated .mat file creates one cell array and one matrix. The cell array contains the text names of the data channels. This is n x 1 in size, where n is the number of channels exported. The matrix contains the numerical data and is m x n in size, where n is again the number of exported data channels and m is the number of samples.
The cell array of names has nearly zero consistency in the organization of the names (not alphabetical, not any order representing a channel number in the recording tool). The only consistency is that the nth row in the cell array shows no name "[]" but is always the "time" channel, and this always corresponds to column 1 in the data matrix. I've attached two mat files for reference. You can see in one file the first three rows are 'AngleSlipPoint2', 'AngleSlipPoint1', and 'AngleSlip', but in the other file the first three rows are 'PosLon', 'FRSpeed', and 'AccActPos'. I know my script can't be as easy as always accessing column 2 for x data and column 6 for y data. I need to search the cell array to link a name to a column in each individual data file.
In the two weeks that I've now been teaching myself how to write scripts and analyze data with Matlab I've apparently learned to do things the ill-advised way. The first obstacle I tried to address was aligning the data name in the cell array with the appropriate column in the data matrix. Because the cell array appears to be an array of text and not characters or strings I could find no other way to pull out the names, link them to data, and generate a variable than to use the frequently unrecommended eval function.
%% Extract Data
uiload
NumVars = numel(Data_Labels); % Establish number of variables to be created.
Time = Data(:,1); % Time values are always the first column of the Data matrix, so it's easy to define and create.
for k = 1:NumVars-1 % Time already created and exists as final channel, so we only need to generate variables for the remaining n-1 variables.
eval([Data_Labels{k},'=[Data(:,k+1)]']); % Extract variable names and populate with data in workspace.
end
This worked for a single data file as it gets me a workspace full of variables and correctly populates them with the numerical data. I can integrate, derive, filter, and plot whatever I want. It fails miserably as soon as I attempt to load a second file as the next import will overwrite everything created from the first file. Hard to compare longitudinal acceleration in 2wd and 4wd when the newest import overwrites the old, and it's understandably stupid to write the script to append a 1/2/3 to the end of the name so I can have multiple instances in the workspace.
This is what I've come up with for importing multiple files. Still using the two attached files as my test files for writing the script.
%% Select files for 2WD analysis
[selected2wdFiles,pathName2wd] = uigetfile('*.mat','Select 2WD data files for analysis','MultiSelect','on');
if isequal(selected2wdFiles, 0)
disp('No Files Selected')
return;
end
for m = 1:length(selected2wdFiles)
data2wd(m) = load(fullfile(pathName2wd, selected2wdFiles{m}));
end
%% Select files for 4WD analysis
[selected4wdFiles,pathName4wd] = uigetfile('*.mat','Select 4WD data files for analysis','MultiSelect','on');
if isequal(selected4wdFiles, 0)
disp('No Files Selected')
return;
end
for n = 1:length(selected4wdFiles)
data4wd(n) = load(fullfile(pathName4wd, selected4wdFiles{n}));
end
This generates two structs, data2wd and data4wd, which contain the loaded cell arrays and data matrices. Unfortunately this script only works if I am selecting multiple files. If I only select one file it fails because the resulting item is a character array instead of a cell array. I haven't tried to script around that, but I suppose a switch case or if statement should work. Question #2 above...any suggestions?
The next step / steps is where I am lost. I believe I have avoided dynamically named variables, but I don't know how to go about extracting my longitudinal acceleration data from each data set. The specific channel name in the cell array of text is going to be 'AccelForward'. I know I need to search the cell array in row 1, column 2 of the struct to find the row number containing that name. This will tell me which column to access in the matrix stored in row 1, column 1 of the struct. Because it is a cell array of text the strfind command doesn't work. They aren't strings. Similarly they aren't characters either, so the related char commands don't work. Without using eval to extract things, how do I go about searching an array of text?
Once I can find the name, identify the data column, and then locate the actual data, how do I manipulate it without falling back on dynamically named workspace variables? I feel like I'm going to end up pulling these columns of data back into the workspace as AccelForward_1, AccelForward_2, etc., and then it gets more complicated and dynamic because I will have 2wd and 4wd data being compared and plotted against each other. What's the correct way to identify the data, manipulate the data, store the new data, and then access it later for plotting? Do I just keep generating more structs or arrays or matrices to stuff the data into and avoid a ridiculous workspace full of variables?
Now that I'm done writing a novel I suppose I simply don't know what I don't know and it makes it difficult to search and find answers. If anyone can put some labels on the forks in the road and send me in a useful direction I'd appreciate it. Thank you.
1 Comment
Stephen23
on 5 Apr 2019
"The only consistency is that the nth row in the cell array shows no name "[]" but is always the "time" channel, and this always corresponds to column 1 in the data matrix."
Ouch!
Accepted Answer
Stephen23
on 5 Apr 2019
Edited: Stephen23
on 5 Apr 2019
You are right to avoid dynamically accessing variable names (e.g. using eval, assignin, evalin, and load without an output variable). Read this to know some of the reasons why:
Here is one simple solution for your task, using a non-scalar structure and dynamic fieldnames:
Using structure fields makes the order of the columns in the numeric matrix totally irrelevant.
[F,P] = uigetfile('*.mat','2WD','MultiSelect','on');
if isnumeric(F)
error('User quit')
elseif ischar(F)
F = {F};
end
S = struct('filename',F);
for ii = 1:numel(F)
T = load(fullfile(P,F{ii}));
L = [{'Time'};T.Data_Labels(1:end-1)]; % fix "Time" column mismatch
for jj = 1:numel(L)
S(ii).(L{jj}) = T.Data(:,jj);
end
end
The imported data is very easy to access in the structure: you only need to refer to the indices (corresponding to each file) and the fieldnames (corresponding to each data column), e.g.:
>> S(1).filename
ans =
MKZ_2WD_LevelSnowAccel.mat
>> S(1).AccelForward([1:4,end-4:end])
ans =
-0.18
-0.18
-0.18
-0.18
... lots of lines
-1.92
-1.72
-1.2
-2.24
-4.1
>> S(1).Time([1:4,end-4:end])
ans =
-5.1505
-5.1405
-5.1305
-5.1205
... lots of lines
23.01
23.02
23.03
23.04
23.05
>> S(2).filename
ans =
MKZ_4WD_LevelSnowAccel.mat
>> S(2).AccelForward([1:4,end-4:end])
ans =
-0.07
-0.07
-0.07
-0.07
... lots of lines
0.23
0.19
0.3
0.33
0.27
>> S(2).Time([1:4,end-4:end])
ans =
-5.3711
-5.3611
-5.3511
-5.3411
... lots of lines
24.069
24.079
24.089
24.099
24.109
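As a rough sketch of how the same channel could then be processed and plotted for every file without creating numbered variables, the loop below stores derived values as new fields of the same structure. The structure here is a synthetic stand-in for the S built above: the file names and numbers are made up, and PeakAccel and Speed are hypothetical field names chosen only for illustration.

```matlab
% Synthetic stand-in for the structure S built above (all values are made up)
S = struct('filename', {'run_2wd.mat', 'run_4wd.mat'});
S(1).Time = (0:0.01:1).';
S(1).AccelForward = 2 * S(1).Time;
S(2).Time = (0:0.01:2).';
S(2).AccelForward = 3 * S(2).Time;

figure
hold on
for ii = 1:numel(S)
    % Derived quantities go back into the same element as new fields
    S(ii).PeakAccel = max(S(ii).AccelForward);
    S(ii).Speed = cumtrapz(S(ii).Time, S(ii).AccelForward); % integrate acceleration over time
    plot(S(ii).Time, S(ii).AccelForward)
end
hold off
legend({S.filename})   % one legend entry per file
xlabel('Time')
ylabel('AccelForward')
```

Because each file is one element of S, comparing 2WD against 4WD is a matter of indexing (S(1) versus S(2)), or of keeping two such structures, rather than inventing names like AccelForward_1 and AccelForward_2.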
You could also do something similar with tables, timetables, or by rearranging the columns of the numeric array to have the same order.
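For the last option, the column that belongs to a name can be looked up with strcmp or ismember, both of which accept a cell array of character vectors. The following is only a sketch: the labels and data are made-up stand-ins for the Data_Labels and Data variables of one file, and the list in wanted is a hypothetical channel order.

```matlab
% Made-up stand-ins for the Data_Labels and Data variables of one file
Data_Labels = {'PosLon'; 'FRSpeed'; 'AccelForward'; []};  % last entry is the unnamed time channel
Data = [(0:3).', rand(4, 3)];                             % column 1 is time

names = [{'Time'}; Data_Labels(1:end-1)];   % names now line up with the columns of Data
col = find(strcmp(names, 'AccelForward'));  % exact, case-sensitive match
accel = Data(:, col);                       % the AccelForward samples

% Rearrange the columns into one fixed order that is the same for every file
wanted = {'Time', 'AccelForward', 'FRSpeed'};
[found, idx] = ismember(wanted, names);
assert(all(found), 'A requested channel is missing from this file')
ordered = Data(:, idx);
```

After this step, column 2 of ordered is AccelForward in every file, so the matrices from different files can be compared by position.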
2 Comments
More Answers (1)
Guillaume
on 5 Apr 2019
Considering that one of the variables is time, you may be better off storing your data in a timetable rather than a structure.
The principle would be the same: use the cell array of names to name the variables instead of fields.
I'm a bit confused about one thing. If the time is the first column of the matrix, why is it the last element of the cell array? Is the array of names reversed with regard to the data columns, or does Data_Labels(1:end-1) correspond to Data(:, 2:end)?
I'm assuming the time is in seconds:
filepath = 'MKZ_2WD_LevelSnowAccel.mat'; % obtained however you want, e.g. with uigetfile
filecontent = load(filepath);
signals = array2timetable(filecontent.Data(:, 2:end), 'RowTimes', seconds(filecontent.Data(:, 1)), 'VariableNames', filecontent.Data_Labels(1:end-1));
If you want to import multiple files, you can store each timetable in a cell array, or vertically concatenate them into one big timetable. For that, I'd add a column indicating which source file each row came from. The order of the variables in a table does not have to be the same when you vertically concatenate tables, so the mismatched ordering wouldn't be an issue.
6 Comments
Guillaume
on 5 Apr 2019
the original question mentions "Currently using Matlab R2014b..."
That, I did indeed miss in the wall of text (and the fact that the Release was tagged, I should have looked at that).
Yes the columns and names do match, just offset by 1 due to the time data being column 1 yet row n in the array of names
Then, both answers account for that. The timetable or structure use the names in whichever order they come to name the matching column.
Neither timetables nor structures care about the ordering of the fields/variables when you operate on them (as long as you are using the names and not numeric indices), so it does not matter if they're not in the same order from file to file.
%tables work the same way as timetables
t1 = array2table(rand(10, 3), 'VariableNames', {'Speed', 'Slip', 'Pitch'})
t1.Slip %will return the 2nd column of the table
t2 = array2table(rand(10, 3), 'VariableNames', {'Pitch', 'Speed', 'Slip'})
t2.Slip %will return the 3rd column of the table
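The same idea extends to combining several files into one table. This is a minimal sketch with made-up numbers: Source is a hypothetical extra variable recording which file each row came from, and the file names are placeholders. It assumes vertical concatenation matches table variables by name, which is the documented behaviour for tables as far as I know.

```matlab
% Made-up data standing in for two imported files (channel order differs)
t1 = array2table([1 10; 2 20], 'VariableNames', {'AccelForward', 'FRSpeed'});
t2 = array2table([30 3; 40 4], 'VariableNames', {'FRSpeed', 'AccelForward'});

% Tag every row with its source so the runs can be separated again later
t1.Source = repmat({'run_2wd.mat'}, height(t1), 1);
t2.Source = repmat({'run_4wd.mat'}, height(t2), 1);

allRuns = [t1; t2];   % variables are matched by name, not by position

% Pull one channel for one run back out by name
is4wd = strcmp(allRuns.Source, 'run_4wd.mat');
accel4wd = allRuns.AccelForward(is4wd);
```

The combined table takes its variable order from the first table, and the rows of the second table are slotted into the matching names.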