# Efficiently populating an array without for loops

Rachel on 27 May 2012
Hi Everyone,
I have a list of data with 10,000,000 rows and 3 columns. The columns hold the shape, size, and color of an object, each encoded as an integer index. There are 100 shapes, 100 sizes, and 50 colors.
I want to create a matrix (100x100x50) that essentially stores the count of each object type, kind of like a histogram for unique objects.
My code below is too slow to run because of the for-loops. Does anyone know of a way to complete the same operation using direct matrix operations? It seems these comparisons should be relatively fast, but they are extremely slow in MATLAB the way I am doing it.
ObjectTypes = zeros(100,100,50);
for Shape=1:100
for Size=1:100
for Color=1:50
ObjectTypes(Shape,Size,Color) = size(MyData(MyData(:,1) == Shape & MyData(:,2) == Size & MyData(:,3) == Color),1);
end
end
end

### Accepted Answer

Geoff on 27 May 2012
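Hah... So an alternative in O(N) time: make one pass over the rows and bump the matching counter.
ObjectTypes = zeros(100,100,50); % preallocate the counts
for n = 1:size(MyData,1)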
row = MyData(n, [1,2,3]);
ObjectTypes(row(1),row(2),row(3)) = ObjectTypes(row(1),row(2),row(3)) + 1;
end
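As a quick sanity check, here is a minimal sketch of the same single-pass idea on a tiny made-up data set. The toy values and the names `nShape`, `nSize`, `nColor`, and `Counts` are assumptions for illustration, not part of the original question.

```matlab
% Toy data (hypothetical): each row is [shape, size, color]
MyData = [1 1 1; 1 1 1; 2 1 3; 2 2 2; 1 1 1; 2 1 3];
nShape = 2; nSize = 2; nColor = 3;   % assumed label ranges for the toy data

% Preallocate, then visit every row exactly once
Counts = zeros(nShape, nSize, nColor);
for n = 1:size(MyData, 1)
    s = MyData(n, 1);
    z = MyData(n, 2);
    c = MyData(n, 3);
    Counts(s, z, c) = Counts(s, z, c) + 1;
end

disp(Counts(1, 1, 1))   % number of rows equal to [1 1 1]
```

Each row is touched once, so the work grows with the number of rows, instead of rescanning all rows for each of the 500,000 cells of the 100x100x50 matrix.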
##### 1 Comment
Rachel on 27 May 2012
This worked very well and quickly ... thanks so much for your help!


### More Answers (2)

Geoff on 27 May 2012
Yeah, that's searching through your data an awful lot: every == comparison scans all 10,000,000 rows, and the loop body runs 500,000 times. The way I do this kind of thing when populating a matrix from database results is to have the data sorted by two variables, and then use diff and find to get the data ranges.
MyData = sortrows(MyData);
Grab out the begin and end index for each group of values in column one.
% Partition by shape
begin1 = [1; 1+find(diff(MyData(:,1)))];
end1 = [begin1(2:end)-1; size(MyData,1)];
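To see what these two lines produce, here is a small sketch on a made-up, already sorted column. The vector `col` and the names `beginIdx` and `endIdx` are assumptions for illustration.

```matlab
% Hypothetical sorted column with three groups: 1,1 | 2,2,2 | 5
col = [1; 1; 2; 2; 2; 5];

% diff is nonzero where the value changes; the row after that starts a new group
beginIdx = [1; 1 + find(diff(col))];
% each group ends just before the next one begins; the last ends at the final row
endIdx = [beginIdx(2:end) - 1; numel(col)];

disp([beginIdx, endIdx])   % one row per group: [first index, last index]
```

The three groups occupy rows 1 to 2, 3 to 5, and 6 to 6, so `col(beginIdx)` returns one representative value per group.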
Now you can combine these into a loop variable, so each time through the loop will give you a 2x1 vector containing the start and end range. You do the same thing again with column 2. Finally I use accumarray to count up all the colours for a given size and shape:
% Process the Shape partitions
for r1 = [begin1, end1]'
Shape = MyData(r1(1), 1); % Single Shape
% Partition by Size
idx1 = r1(1):r1(2);
col2 = MyData(idx1, 2);
begin2 = [1; 1+find(diff(col2))];
end2 = [begin2(2:end)-1; numel(col2)];
% Process the Size partitions
for r2 = [begin2, end2]'
Size = col2(r2(1)); % Single Size
idx2 = r1(1)+r2(1)-1 : r1(1)+r2(2)-1; % r2 is 1-based within the current Shape partition
% Count up all the Color occurrences for Shape and Size
Color = MyData(idx2, 3);
colorCount = accumarray(Color, ones(numel(Color),1));
ObjectTypes(Shape, Size, 1:max(Color)) = colorCount;
end
end
I would hope this is faster than your current loop, although there are probably clever ways to use accumarray without all the looping guff I've done. Apologies if there are errors in this code. I just hacked it straight into my web browser =)
##### 1 Comment
Geoff on 27 May 2012
Made a couple of edits in the inner loop to fix a couple of obvious mistakes.


Walter Roberson on 27 May 2012
Are the numbers for the shape, size, color consecutive integers each starting from 1? If they are then the code can be reduced to
ObjectTypes = accumarray(MyData, 1);
If not, then you can create the consecutive integers by using the three-output version of unique().
[ushape, ~, shapeidx] = unique(MyData(:,1)); % column 1: shape
[usize, ~, sizeidx] = unique(MyData(:,2)); % column 2: size
[ucol, ~, colidx] = unique(MyData(:,3)); % column 3: color
ObjectTypes = accumarray( [shapeidx(:), sizeidx(:), colidx(:)], 1);
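A minimal sketch of both variants on toy data follows. The values, the names `Counts`, `Counts2`, and `Labels`, and the explicit size argument `[2 2 3]` are assumptions for illustration. Passing a size to accumarray keeps the result at its full dimensions even when the largest label never occurs in the data.

```matlab
% Case 1: labels are already consecutive integers starting at 1
MyData = [1 1 1; 1 1 1; 2 1 2; 2 2 2];
Counts = accumarray(MyData, 1, [2 2 3]);   % third dim stays 3 although color 3 is absent

% Case 2: arbitrary labels (hypothetical), remapped to 1..K per column
Labels = [10 7 300; 10 7 300; 40 7 500; 40 9 500];
[ushape, ~, shapeidx] = unique(Labels(:,1));
[usize,  ~, sizeidx]  = unique(Labels(:,2));
[ucol,   ~, colidx]   = unique(Labels(:,3));
Counts2 = accumarray([shapeidx(:), sizeidx(:), colidx(:)], 1);

% Counts2(i,j,k) is the count for shape ushape(i), size usize(j), color ucol(k)
disp(size(Counts))
disp(Counts2(1, 1, 1))
```

For the question's data, the corresponding size argument would be `[100 100 50]`.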