Removing empty cells with non-zero dimensions

My code needs to deal with a cell array X, each cell of which is itself a cell array, containing a double array. For example, X could look as follows:
X = cell(N,1);
for i=1:N
X{i}=cell(1,10);
for j=1:10
X{i}{j} = randi(10, 5,2); %each cell contains a double array of size (5,2)
end
end
As my code runs, some rows of these double arrays might get removed. For example:
for i=1:N
for j=1:10
X{i}{j}(X{i}{j}(:,1) < 3,:) = [];
end
end
In some cases, all elements of some double arrays get removed, resulting in a 0×2 empty double matrix. This nonzero size is causing problems elsewhere in my code. How do I efficiently replace these with 0×0 empty arrays?
My current approach is to call the following for-loop after each set of manipulations that might result in empty arrays with nonzero size.
for i=1:N
for j=1:10
if isempty(X{i}{j})
X{i}{j} = [];
end
end
end
However, I'm fairly certain that there is a better way of doing this. Any suggestions?
Edit: I want to emphasize that I do not want to remove the empty cells. What I do want is to replace any 0x2 empty double matrices with 0x0 matrices.
The 10 cells inside each X{i} represent "physical" lattice sites in my simulation. An empty cell does have a meaning, and should not be removed.

3 Comments

That would just change the empty cell from a 0x2 to a 0x0. Is your goal to remove the empty cells completely? Note that the 2nd layer of cells may no longer all be the same length.
No, I explicitly want to keep the empty cells, I just don't want them to have a non-zero size if they are empty.
The 10 cells inside each X{i} represent "physical" lattice sites in my simulation. An empty cell does have a meaning, and should not be removed.
Adam Danz on 24 Aug 2020
Edited: Adam Danz on 24 Aug 2020
I see. I'll update my answer.
Note that the isempty function will return the same results whether the cell is 0xn, nx0 or 0x0 but if you're using the cell size for any reason, then it matters what the empty dimensions are.
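The difference can be seen in a small sketch. The array A and the loop bound below are made up for illustration; the loop is only one plausible way a 0x2 size could cause trouble, not necessarily what happens in the OP's code.

```matlab
% Illustrative data only: randi(10,...) is always < 11, so every row is deleted
A = randi(10, 5, 2);
A(A(:,1) < 11, :) = [];   % deleting all rows leaves a 0x2 array, not 0x0

B = [];                   % a plain 0x0 empty

disp(isempty(A))          % 1
disp(isempty(B))          % 1
disp(size(A))             % 0 2
disp(size(B))             % 0 0

% Code that relies on size still "sees" two columns in A
nIterA = 0;
for k = 1:size(A, 2)
    nIterA = nIterA + 1;  % runs twice for A, would run zero times for B
end
disp(nIterA)              % 2
```

So isempty treats both arrays alike, while anything based on size(A,2) behaves differently for the two.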


Accepted Answer

Adam Danz on 24 Aug 2020
Edited: Adam Danz on 24 Aug 2020
How to remove empty cells
To remove all empty cells in the 2nd layer of a nested cell array named X,
for i = 1:numel(X)
X{i}(cellfun(@isempty,X{i})) = [];
end
Or, in 1 line,
X = cellfun(@(C){C(~cellfun(@isempty,C))},X);
That may eliminate all of the 2nd-layer nested cells, in which case some cells in the first layer may become empty. If you'd like to eliminate those as well (i.e., all cells where every nested cell was removed),
X(cellfun(@isempty, X)) = [];
How to replace 0xn or nx0 empty cells with 0x0
To replace all 0xn or nx0 cells in the 2nd layer of a nested cell array named X,
for i = 1:numel(X)
X{i}(cellfun(@isempty,X{i})) = {[]};
end
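A quick sanity check of this replacement on a small hand-built example. The test data below is hypothetical and not taken from the question; it only shows that the inner cells are kept and that each empty one becomes 0x0.

```matlab
% Hypothetical test data: two outer cells; some inner cells are 0x2 or 3x0 empties
X = {{zeros(0,2), ones(5,2), zeros(3,0)}; {ones(2,2), zeros(0,2)}};

for i = 1:numel(X)
    X{i}(cellfun(@isempty, X{i})) = {[]};   % overwrite empties, keep the cells
end

% The number of inner cells is unchanged; only the empty ones are now 0x0
disp(numel(X{1}))     % 3
disp(size(X{1}{1}))   % 0 0
disp(size(X{1}{2}))   % 5 2
disp(size(X{1}{3}))   % 0 0
disp(size(X{2}{2}))   % 0 0
```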

1 Comment

I'm guessing that your workflow uses size(), which is why it's a problem when a cell is 0x2. If that's the case, you could avoid this entire process by using isempty() within your workflow instead of size(). If the size of the arrays is already stored somewhere as sz, you could use something like if any(sz==0).
Also, if the second block of code in your question resembles what you're actually doing, you could shave off some time by fixing the problem within that section rather than adding another set of loops to convert 0x2 to 0x0. This is the fastest method yet, I believe (not that it matters at this point).
% Replace the 2nd block of code in your question with this
for i=1:N
Xi = X{i};
for j=1:10
rmIdx = Xi{j}(:,1) < 3;
if all(rmIdx)
Xi{j} = [];
else
Xi{j}(rmIdx,:) = [];
end
end
X{i} = Xi;
end


More Answers (1)

Bruno Luong on 24 Aug 2020
Edited: Bruno Luong on 24 Aug 2020
I like your for-loop; you might speed it up a little bit:
for i=1:N
Xi = X{i};
Xi(cellfun('isempty',Xi)) = {[]}; % switch to string from Rik's remark
X{i} = Xi;
end

13 Comments

You can replace the outer for-loop with cellfun
X = cellfun(@ReplaceEmpty, X, 'unif', 0)
function Xi = ReplaceEmpty(Xi)
Xi(cellfun('isempty',Xi)) = {[]}; % switch to string from Rik's remark
end
Adam Danz on 24 Aug 2020
Edited: Adam Danz on 24 Aug 2020
The OP's original nested loops are actually 1.99x faster than the loop in your answer and 1.84x faster than the loop in my answer, on average; the slowdown is mainly due to cellfun.
Each was timed 1000 times, comparing the median values.
Your loop isn't really different from mine. It unpacks and repacks the cell array, which adds a tiny bit more time.
AS on 24 Aug 2020
Edited: AS on 24 Aug 2020
Wait, are you saying my original method is the fastest approach? I expected something using cellfun to be faster, I just didn't get it to work properly without some help.
edit: some testing suggests that it is indeed quite a lot faster. I assumed that arrayfun and cellfun would speed things up, but that turns out not to be true.
Yeah, that's why I first stated that I like the OP's for-loop.
I'm still out there looking for an example where CELLFUN/ARRAYFUN beats a FOR-LOOP.
"I expected something using cellfun to be faster"
I don't understand where a lot of people get this expectation from. CELLFUN/ARRAYFUN is a scam. It does provide compact code, that's all.
Adam Danz on 24 Aug 2020
Edited: Adam Danz on 24 Aug 2020
"CELLFUN/ARRAYFUN is a scam" 😄
Generally vectorization is faster than loops which initially gave for-loops a bad rep. But speed has generally increased, especially with Matlab's JIT compilation. cellfun, arrayfun, etc all have internal loops anyway. Their main attraction is the reduction of lines of code and, sometimes, improved readability (certainly not always; sometimes they are very difficult to interpret). For simple operations, loops, even nested loops, are often faster.
Rik on 24 Aug 2020
Edited: Rik on 24 Aug 2020
Though in this case the main slowdown is due to your use of the handle style, instead of the char input to cellfun:
N=100;
X = cell(N,1);for i=1:N,X{i}=cell(1,10);for j=1:10,X{i}{j}=randi(10,5,2);end,end
for i=1:N,for j=1:10,X{i}{j}(X{i}{j}(:,1)<3,:)=[];end,end
[timeit(@()cellfun_handle(X)) %42 microseconds
timeit(@()cellfun_str(X)) % 2.1 microseconds
timeit(@()for_fun(X))] % 1.5 microseconds
function out=cellfun_handle(X)
out=cellfun(@isempty, X);
end
function out=cellfun_str(X)
out=cellfun('isempty', X);
end
function out=for_fun(X)
out=false(size(X));
for n=1:numel(X)
out(n)=isempty(X{n}); % test the n-th cell, not the whole cell array
end
end
This is the fastest according to my benchmark:
for i=1:N
Xi = X{i};
for j=1:10
if isempty(Xi{j})
Xi{j} = [];
end
end
X{i} = Xi;
end
Rik on 24 Aug 2020
Edited: Rik on 25 Aug 2020
If you look at the numbers I posted: I agree. Using a for loop is faster. The thing I pointed out there is that it isn't much faster than cellfun('isempty',X), while cellfun(@isempty,X) is a lot slower.
Adam Danz on 24 Aug 2020
Edited: Adam Danz on 24 Aug 2020
Great point, Rik!
I suppose that extra time is saved by not sorting through overloaded versions of the function. Thanks for that reminder!
@Bruno Luong, good idea adding the condition to check for empties.
Bruno Luong on 24 Aug 2020
Edited: Bruno Luong on 24 Aug 2020
@Rik, historically CELLFUN has had a special fast implementation for a small number of functions, which is invoked by passing the name as a string 'xx' rather than as a handle @xx. 'isempty' is among them.
At some point TMW recommended not using the string form. I would have thought they had moved the special implementation over to the @xx syntax; obviously not. So thanks for reminding us. TMW must get to work and implement what they have left undone.
AS on 24 Aug 2020
Edited: AS on 24 Aug 2020
@Bruno Luong, would you mind explaining why defining and then using Xi = X{i}; inside the first loop speeds things up? It's more than twice as fast on my machine.
Bruno Luong on 25 Aug 2020
Edited: Bruno Luong on 25 Aug 2020
Well, a very simple explanation:
With X{i}{j} you tell MATLAB to index twice, first with i and then with j.
With Xi{j} there is only one indexing operation, with j, since Xi is already a variable. Inside the for-loop that makes a difference.
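A rough way to check this on your own machine is sketched below. It assumes the same layout as in the question (N outer cells with 10 inner cells each). N = 1000, the 50% share of 0x2 empties, and the tic/toc timing are arbitrary choices for illustration, and the measured times will vary with release and hardware.

```matlab
% Build test data: N outer cells, each holding 10 inner cells,
% roughly half of them 0x2 empties (N and the ratio are arbitrary)
N = 1000;
X = cell(N,1);
for i = 1:N
    X{i} = cell(1,10);
    for j = 1:10
        if rand < 0.5
            X{i}{j} = zeros(0,2);
        else
            X{i}{j} = randi(10,5,2);
        end
    end
end

% Variant 1: index through X1{i}{j} on every access
X1 = X;
tic
for i = 1:N
    for j = 1:10
        if isempty(X1{i}{j})
            X1{i}{j} = [];
        end
    end
end
tDouble = toc;

% Variant 2: pull X2{i} out once, work on the copy, write it back
X2 = X;
tic
for i = 1:N
    Xi = X2{i};
    for j = 1:10
        if isempty(Xi{j})
            Xi{j} = [];
        end
    end
    X2{i} = Xi;
end
tCached = toc;

fprintf('X{i}{j}: %.4f s, Xi{j}: %.4f s\n', tDouble, tCached);
```

Both variants should produce the same result; only the number of indexing operations inside the inner loop differs. For more reliable numbers, each variant could be wrapped in a function and measured with timeit.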


Release: R2018a
Asked: AS on 24 Aug 2020
Edited: 25 Aug 2020
