How to split a string and make multiple rows from the original data for each split?
6 views (last 30 days)
I have an excel sheet with some students data and it has 11 columns and about 500 rows. I want to manipulate data in the grade and class columns.
The current format is the grade column is written in A,B,D,C format and class is written Chem, Math,Eng, Sci format.
Assuming the order is correct,
I want to split the grades and the classes at the comma and make each grade and class in a new row and duplicating the rest of the data like the name, count etc.
For example: the first row has Smith and his grades A,B,D,C as shown below
I want it to make 4 rows for Smith, each row represent one grade and one class like this
data = readtable("New Microsoft Excel Worksheet.xlsx")
convList = ;
for i = 1:length(Conv)
convList = [convList; split(Conv(i), ",")];
convList2 = ;
for i = 1:length(Conv2)
convList2 = [convList2; split(Conv2(i), ",")];
This code will split string at the comma and I'm stuck trying to duplicate the data for each run. I would really appreciate any help
Karim on 16 Aug 2022
Edited: Karim on 16 Aug 2022
Hi, there are several methods to do this. One way is to create an index vector and use this to copy the variable. See below :)
data = readtable("https://www.mathworks.com/matlabcentral/answers/uploaded_files/1099480/New%20Microsoft%20Excel%20Worksheet.xlsx");
% split the grades and class into columns
Grades_split = split(string(data.Grades),',');
Class_split = split(string(data.Class),',');
% detele spaces in the 'class' strings
Class_split = strtrim(Class_split);
% reshape into a row vector
Grades_split = reshape(pagetranspose(Grades_split),,1);
Class_split = reshape(pagetranspose(Class_split),,1);
% create an index vector
MyIndex = repmat(1:height(data),4,1);
MyIndex = reshape(MyIndex,,1);
% create the new table based on the index
NewTable = data(MyIndex,:);
% insert splitted data
NewTable.Grades = Grades_split;
NewTable.Class = Class_split;
More Answers (1)
Cris LaPierre on 16 Aug 2022
I think you could get there using the stack function. There is some preprocessing needed first, namely splitting the Grades into their own variables, but that can be done using split and splitvars. To work with strings, set your 'TextType' property to 'string' in readtable.
Here is a brief example.
data = readtable("https://www.mathworks.com/matlabcentral/answers/uploaded_files/1099480/New%20Microsoft%20Excel%20Worksheet.xlsx",...
data.Grades = split(data.Grades,",")
dataNew = splitvars(data,'Grades')
dataFinal = stack(dataNew,["Grades_"+(1:4)])