Remove columns from cell array when several cells in such columns have zero-values

Hi all,
I use R2019b. I have a cell array (11 x 400) in which the first row and first column contain words (e.g. 'Person#001', 'Person#002', ... 'Person#399' for the first row and 'ITEM1', 'ITEM2', ... 'ITEM10' for the first column). The remainder of the cells (2:end,2:end) are occupied by numerical values that range from 0 to 1. This dataset represents the amounts of each of the 10 items belonging to each of the 399 persons.
Some of these persons have only 0 values for all of their respective numerical values. I am trying to remove the columns from the cell array that contain only zero numerical values in rows 2 through 11 (i.e. for all 10 items). In the example below (that represents a condensed version of my data), I want the columns with 'Person#004' and 'Person#006' at the top to be removed entirely. There are several of these all-zero columns throughout the entire dataset and ideally this script would remove all of them.
Thanks in advance for your help!
BEFORE
         'Person#001'  'Person#002'  'Person#003'  'Person#004'  'Person#005'  'Person#006'  . . .  'Person#399'
'ITEM1'  0             0.2           0.4           0             0.1           0             . . .  0.4
'ITEM2'  0             0             0.2           0             0             0             . . .  0.1
'ITEM3'  0.5           0             0.1           0             0             0             . . .  0
'ITEM4'  0             0.1           0             0             0             0             . . .  0.3
AFTER
         'Person#001'  'Person#002'  'Person#003'  'Person#005'  . . .  'Person#399'
'ITEM1'  0             0.2           0.4           0.1           . . .  0.4
'ITEM2'  0             0             0.2           0             . . .  0.1
'ITEM3'  0.5           0             0.1           0             . . .  0
'ITEM4'  0             0.1           0             0             . . .  0.3

Accepted Answer

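cellArray is your cell array.
threshold is the proportion (0 to 1) of 0-values per column that defines "several". To remove only the columns that are entirely zero, as in your example, set threshold to 1.
See inline comments for details.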
% Define the threshold for "several" as a proportion of the values in rows 2:end.
% For example, 0.5 means at least 50% of the values in a column are 0 (ignoring the 1st row);
% 1.0 means 100% of the values in a column are 0 (ignoring the 1st row).
threshold = 0.5;
% Compute the proportion of 0-values per column in rows 2:end (pZero)
isZero = cellfun(@(x) isnumeric(x) && x==0, cellArray(2:end,:)); % <--- UPDATED FROM ORIGINAL ANSWER
pZero = mean(isZero,1);
% Remove columns with "several" zeros
cellArray(:,pZero >= threshold) = [];
If you want to retain the original cell array, copy it to a different variable before the final line.
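As a rough check, the same steps can be run on the condensed data from the question. This is only a sketch: the empty top-left cell and the variable name filtered are assumptions, and threshold is set to 1.0 because the question asks to drop only all-zero columns.

```matlab
% Condensed sample data from the question.
% Assumption: the top-left cell is an empty char array.
cellArray = { ''       'Person#001' 'Person#002' 'Person#003' 'Person#004' 'Person#005' 'Person#006' 'Person#399'
              'ITEM1'  0            0.2          0.4          0            0.1          0            0.4
              'ITEM2'  0            0            0.2          0            0            0            0.1
              'ITEM3'  0.5          0            0.1          0            0            0            0
              'ITEM4'  0            0.1          0            0            0            0            0.3 };

threshold = 1.0;   % remove only columns whose data cells are all 0

% Logical map of zero-valued numeric cells in rows 2:end (label cells give false)
isZero = cellfun(@(x) isnumeric(x) && x==0, cellArray(2:end,:));
pZero  = mean(isZero, 1);   % proportion of zeros per column

% Work on a copy (hypothetical name) so the original cell array is retained
filtered = cellArray;
filtered(:, pZero >= threshold) = [];

disp(filtered(1,:))   % remaining column headers
```

With threshold = 0.5 instead, the columns for 'Person#001', 'Person#002' and 'Person#005' would also be removed from this sample, since at least half of their values are 0.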

4 Comments

Thank you Adam for helping me out - I really appreciate it! I am finding that this script removes the column that precedes the all-zero column in cellArray. Each entry in pZero sits one column to the left of the matching all-zero column in cellArray. For example, person#145 has all zeros and is found in column 146 of cellArray, but pZero flags it in column 145, so column 145 (representing person#144) gets eliminated from cellArray. Do you have any suggestions to fix this? Thank you.
I see that your first column contains strings, which would cause an error with my original answer.
I'm guessing that you fixed that error by changing
isZero = cellfun(@(x) x==0, cellArray(2:end,:));
%                           ^^^^^^^^^^^^^^^^^^
to
isZero = cellfun(@(x) x==0, cellArray(2:end,2:end));
%                           ^^^^^^^^^^^^^^^^^^^^^^
which would produce the results you're describing.
Instead, to fix the error, change that line to
isZero = cellfun(@(x) isnumeric(x) && x==0, cellArray(2:end,:));
I'll update my answer to reflect this improvement.
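To illustrate the shift described above: when only the numeric block (2:end,2:end) is scanned, pZero has one entry fewer than cellArray has columns, so pZero(k) refers to cellArray(:,k+1). The sketch below uses a made-up 3-by-4 cell array and a hypothetical variable removeMask to show an alternative way to realign the mask by prepending false for the label column. It is not part of the accepted answer, just another option under those assumptions.

```matlab
% Hypothetical 3-by-4 example; labels and values are made up for illustration.
cellArray = { ''       'P1'  'P2'  'P3'
              'ITEM1'  0.2   0     0.4
              'ITEM2'  0.1   0     0   };
threshold = 1.0;

% Scanning only the numeric block gives one entry per data column,
% so pZero(k) describes cellArray(:, k+1).
isZero = cellfun(@(x) x==0, cellArray(2:end,2:end));
pZero  = mean(isZero, 1);

% Using pZero >= threshold directly would delete column 2 ('P1') here.
% Prepending false for the label column realigns the mask with cellArray.
removeMask = [false, pZero >= threshold];
cellArray(:, removeMask) = [];

disp(cellArray(1,:))   % remaining column headers
```

The isnumeric guard in the line above avoids this bookkeeping entirely, because the label column is kept in the scan and simply evaluates to false.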
Thank you so much Adam. I'm now getting what I was hoping for. I am still getting a grasp of MATLAB so I am grateful for the support!
