Deleted table rows stuck in memory? Cannot fit a linear model.
I have a data table with both continuous and categorical values. I want to run a linear model for this table using 'fitlm'. I have a loop where I pick a different subset of rows and fit a model for it.
However, it appears that I cannot slice categorical variables. fitlm sees all the categories of the full table and complains: "Warning: Regression design matrix is rank deficient to within machine precision." The non-existent categories also appear in the model. Even creating a temporary table from a numerical matrix does not help!
Here is an example. I don't understand why one categorical factor (condition 3) won't go away.
% data with 3 categories
data = [...
1.9,1;
5.7,2;
0.7,1;
2.2,2;
0,1;
1.9,2;
-0.2,1;
1.6,2;
-0.7,1;
2.3,2;
1,3];
% create table
data_table = array2table(data,'VariableNames',{'Y','Condition'});
% make Condition a categorical variable
data_table.Condition=categorical(data_table.Condition);
% fit linear model (basically a t-test)
model1 = fitlm(data_table,'Y ~ 1 + Condition');
% this works, but condition 3 is basically useless with only 1 sample
% Let's remove the final row and, with it, condition 3
data_table = data_table(1:end-1,:);
% repeat with the sliced table (only 2 categories remain in the data)
model2 = fitlm(data_table,'Y ~ 1 + Condition');
% We get a warning. Condition 3 is still there with no data.
% Create a new table via a cell array (table2cell returns cells, not a numerical array)
mat = table2cell(data_table);
new_data_table = cell2table(mat,'VariableNames',{'Y','Condition'});
new_data_table.Condition=categorical(new_data_table.Condition);
% no category 3 in the new table
model3 = fitlm(new_data_table,'Y ~ 1 + Condition');
% still the same warning even if there never was condition 3 in this table
% ok, let's clear the old tables and start from the cell array
clear data_table new_data_table data;
new_new_data_table = cell2table(mat,'VariableNames',{'Y','Condition'});
new_new_data_table.Condition=categorical(new_new_data_table.Condition);
% again, no category 3 in the new table
model4 = fitlm(new_new_data_table,'Y ~ 1 + Condition');
% still the same warning, condition 3 remains
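A minimal sketch for checking what is going on, assuming the standard `categories` and `countcats` functions: `categories` lists every category a categorical array defines, whether or not any row uses it, and `countcats` counts the rows per category. The toy vector `c` and the names `sub`, `t`, `mat` and `t2` are made up for illustration. Row indexing keeps all defined categories, and `table2cell` returns cells that hold categorical scalars rather than numbers, which would explain why the rebuilt tables behave the same way.

```matlab
% Hypothetical toy data, mirroring the example above
c = categorical([1; 2; 1; 2; 3]);
sub = c(1:end-1);              % drop the only element with category 3

% Indexing removes the element, not the category definition
disp(categories(sub))          % category names defined on the array
disp(countcats(sub))           % number of elements per category

% Round trip through a cell array, as in the question
t = table([1.9; 5.7; 0.7; 2.2], sub, 'VariableNames', {'Y','Condition'});
mat = table2cell(t);
disp(class(mat{1,2}))          % the cells hold categorical scalars, not doubles

% Assumption: the rebuilt column inherits the categories of those scalars
t2 = cell2table(mat, 'VariableNames', {'Y','Condition'});
disp(categories(t2.Condition))
```

If `countcats` reports a zero for some category, that level has no rows, and fitlm would presumably still build a column for it, hence the rank deficiency warning.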
ADDITION:
In the latest version of MATLAB I could probably use "removecats" to delete non-existing categories. However, this function is not available in R2017b.
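A hedged sketch of the removecats route on the data from the example. As far as I know, the documentation lists removecats as introduced in R2013b, so it may be worth running `which removecats` in R2017b before ruling it out. The fallback through `cellstr` is an assumption for the case where removecats really is missing: rebuilding the categorical from its text labels should define only the categories that actually occur. The variable names `keep` and `subset` are illustrative.

```matlab
% Data from the example above: Y and a 3-level Condition
data = [1.9,1; 5.7,2; 0.7,1; 2.2,2; 0,1; 1.9,2; ...
        -0.2,1; 1.6,2; -0.7,1; 2.3,2; 1,3];
data_table = array2table(data, 'VariableNames', {'Y','Condition'});
data_table.Condition = categorical(data_table.Condition);

% Pick a subset of rows (here: everything except the last row)
keep = 1:height(data_table)-1;
subset = data_table(keep, :);

% Drop categories that no longer have any rows in the subset
subset.Condition = removecats(subset.Condition);

% Fallback sketch if removecats is unavailable (assumption):
% subset.Condition = categorical(cellstr(subset.Condition));

disp(categories(subset.Condition))   % only the levels present in the subset

% Fit on the cleaned subset (requires Statistics and Machine Learning Toolbox)
model2 = fitlm(subset, 'Y ~ 1 + Condition');
```

Inside a loop over subsets, the removecats call would go right after the row selection, so each fit only sees the levels present in that subset.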