Working with very big data faster?
Dear Matlab users,
I have to process very large data sets (point clouds, generally more than 30 000 000 points) in MATLAB. I can read the ASCII data with the "textscan" function. After reading, I need to detect invalid data (points with 0,0,0 coordinates) and then perform some mathematical operations on each point or each line in the data. Currently, I first read the data with "textscan" and assign it to a matrix. Then I use for loops to detect invalid points and to apply the mathematical operations to each point or line. A sample of my code is shown below. Is there a way to avoid the for loops, or what is the best way to speed up this computation? I look forward to hearing from you.
fileID = fopen('some ascii data with more than 10 000 000 points');
original_data = textscan(fileID,'%f %f %f %f %f %f %f', 'delimiter',' ');
fclose(fileID);
column = original_data{1}(1);
row = original_data{1}(2);
t_matrix = [original_data{1}(7) original_data{2}(7) original_data{3}(7) original_data{4}(7)
original_data{1}(8) original_data{2}(8) original_data{3}(8) original_data{4}(8)
original_data{1}(9) original_data{2}(9) original_data{3}(9) original_data{4}(9)
original_data{1}(10) original_data{2}(10) original_data{3}(10) original_data{4}(10)];
coordinate_list(:,1) = original_data{1}(11:length(original_data{1}));
coordinate_list(:,2) = original_data{2}(11:length(original_data{2}));
coordinate_list(:,3) = original_data{3}(11:length(original_data{3}));
coordinate_list(:,4) = 0;
coordinate_list(:,5) = original_data{4}(11:length(original_data{4}));
%detect invalid points and transform each point with t_matrix
for i = 1:length(coordinate_list)
    if coordinate_list(i,1) == 0 && coordinate_list(i,2) == 0 && coordinate_list(i,3) == 0
        transformed_list(i,:) = NaN;
    else
        %transformed_list(i,:) = coordinate_list(i,:)*t_matrix;
        transformed_list((i:i),(1:4)) = coordinate_list((i:i),(1:4))*t_matrix;
        transformed_list(i,5) = coordinate_list(i,5);
    end
    i
end
6 Comments
KSSV
on 26 Sep 2016
You have not preallocated transformed_list, so it grows on every loop iteration, and that makes the code slow. You should preallocate it before the loop.
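As a rough illustration of that advice, the sketch below allocates the output once before the loop. The array size and the random `coordinate_list` are made-up stand-ins for the real point cloud, and the loop body is only a placeholder for the actual per-point work.

```matlab
% Hypothetical stand-in for the real point cloud: n rows, 5 columns.
n = 1000;
coordinate_list = rand(n, 5);

% Allocate the full output once, before the loop, so MATLAB does not
% have to grow (and copy) the array on every iteration.
transformed_list = zeros(size(coordinate_list));

for k = 1:n
    % Placeholder work: copy the row. The real code would transform it.
    transformed_list(k, :) = coordinate_list(k, :);
end
```

With tens of millions of rows, repeated reallocation inside the loop is likely to be a significant cost, so this one-line change is usually worth trying first.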
Adam
on 26 Sep 2016
Have you run the profiler on your code?
doc profile
You should always do this before making any attempt at speeding up your code, otherwise how do you know which part is taking the longest time? Assumptions are generally a very bad idea!
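A minimal sketch of a profiling session might look like the following. The matrix multiplication is only a placeholder workload (an assumption for illustration); in practice the code under investigation would go between `profile on` and `profile off`.

```matlab
profile on                 % start collecting timing data

% Placeholder workload; replace with the code you want to measure.
A = rand(200);
B = A * A;

profile off                % stop collecting
p = profile('info');       % struct holding the collected statistics

% profile viewer           % opens the interactive report (needs a desktop session)
```

The `FunctionTable` field of `p` lists the time spent per function, and the viewer shows the same data line by line.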
mustafa ozendi
on 26 Sep 2016
KSSV
on 26 Sep 2016
Does your text file contain any text, or only numbers? Can you attach a sample of the text file?
mustafa ozendi
on 26 Sep 2016
per isakson
on 26 Sep 2016
Edited: per isakson
on 26 Sep 2016
Use
textscan( ..., 'CollectOutput',true )
Neither of your two samples matches
textscan(fileID,'%f %f %f %f %f %f %f', 'delimiter',' ');
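To show what 'CollectOutput' changes, here is a small sketch that parses an in-memory string instead of a file. The three sample lines and the four-column format are assumptions for illustration only; the real file layout is not shown in this thread, so the format string has to be adapted to it.

```matlab
% Hypothetical sample text with four numeric columns per line.
txt = sprintf('1 2 3 4\n5 6 7 8\n0 0 0 9\n');

% With 'CollectOutput' set to true, adjacent columns of the same type
% are concatenated into one array instead of one cell per column.
C = textscan(txt, '%f %f %f %f', 'CollectOutput', true);

data = C{1};   % 3-by-4 double matrix, one row per input line
```

Getting a single numeric matrix back avoids copying each column into `coordinate_list` one at a time.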
Answers (1)
To check whether (x,y,z) are all zero, you do not need a loop. You can test every row in a single step.
id = all(coordinate_list(:,1:3)==0,2) ; % logical column: true where x, y and z are all zero
idx = find(id) ; % row indices of the points where x, y and z are all zero
Everything else in the loop can also be done without a for loop.
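One possible loop-free version of the whole step is sketched below. The three sample points and the identity `t_matrix` are hypothetical placeholders; the variable names follow the question, and the column layout (columns 1 to 4 transformed, column 5 copied) is taken from the original loop.

```matlab
% Hypothetical small inputs standing in for the real data.
coordinate_list = [1 2 3 0 10;
                   0 0 0 0 20;
                   4 5 6 0 30];
t_matrix = eye(4);   % placeholder 4-by-4 transform

% Logical mask: true for rows whose x, y and z are all zero.
invalid = all(coordinate_list(:, 1:3) == 0, 2);

% Preallocate with NaN so invalid rows need no further handling.
transformed_list = NaN(size(coordinate_list));

% Transform all valid rows in one matrix product, then copy column 5.
transformed_list(~invalid, 1:4) = coordinate_list(~invalid, 1:4) * t_matrix;
transformed_list(~invalid, 5)   = coordinate_list(~invalid, 5);
```

This replaces millions of small row-by-row products with a single matrix multiplication. It creates a few temporary arrays of roughly the input size, so for very large clouds memory may become the limiting factor.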