How can I speed up my code?

Hi! I wonder if it is possible to speed up my code. It's working properly and I have had plenty of help from your experts. So, a big thank you for that!
At the current data input (100 text files) it takes about 1 min and 7 sec to run.
Is it possible to make some adjustments to speed it up a bit? Here's my code, and please be kind, I'm really a beginner at programming:
clear all
close all
addpath \\winfs...this is where I get my input textfiles...\months;
disp('Välkommen till detta verktyg för METAR-insamling!')
promt='Ange önskad flygplats ICAO-kod (ex ESSA):';
a=input(promt,'s');
disp('Data finns från 2008-07-01 00:20Z')
promt='Ange önskad starttid YYYYMM (ex 200807):';
b=input(promt,'s');
promt='Ange önskad sluttid YYYYMM (ex 201610):';
c=input(promt,'s');
disp('Bra val! Data laddas, detta kan ta upp till 1 min...')
formatIn = 'yyyymm';
Startnum=datenum(b,formatIn);
Slutnum=datenum(c,formatIn);
indata=[Startnum:Slutnum];
dvec=datevec(indata);
duniq = unique(dvec(:, 1:2), 'rows');
duniqyear = unique(duniq(:, 1), 'rows');
result = datenum(duniq(:,1), duniq(:,2), 1);
w=datestr(result,'yyyymm');
numfiles = size(w,1);
Data = cell(1, numfiles);
for h=1:numfiles;
    filename=sprintf('%s.txt',w(h,:));
    fileID=fopen(filename);
    A=fread(fileID,'*char');
    fclose(fileID);
    B=A';
    Data{h} = strread(B,'%s','delimiter','\n\b');
end
Utcell = cell(1,length(Data));
for l = 1:length(Data);
    s = strfind(Data{l},a);
    empty=zeros(1,length(Data{l}))';
    j=0;
    for k = 1:length(Data{l})
        ind = find(s{k});
        if ind==1;
            empty(k)=j+1;
        end
    end
    metar = find(empty);
    Nydata = Data{l};
    Utcell{l} = Nydata(metar);
end
Utmetar=cell(1,numfiles);
for n=1:numfiles
    filename=sprintf('%s.txt',w(n,:));
    YYYYMM = regexprep(filename, '.txt','');
    flygplatsYYYYMM=Utcell{n};
    yearmonth = YYYYMM;
    Utmetar{n}=regexprep(flygplatsYYYYMM, '\d{6}Z', sprintf('%s$0', yearmonth));
end
disp('Var god vänta, sparar data som .xlsx-fil...')
UtmetarCell = {cat(1,Utmetar{:})};
UtmetarCat=UtmetarCell{:};
%XLSX-fil
delete('*.xlsx');
b=sprintf('%s.xlsx',a);
%xlswrite(b,UtmetarCat);
yearstring = regexp(UtmetarCat, '(?<= )\d{4}(?=\d+Z)', 'match', 'once'); %extract year as string
%assert(~any(cellfun(@isempty, yearstring)), 'Failed to find year in some string');
[duniqyear, ~, idx] = unique(str2double(yearstring)); %convert to numeric, get unique values and corresponding index in A
NaNduniqyear=isnan(duniqyear); %look for NaN
Nanrows=length(find(NaNduniqyear)); %count how many NaN rows there are
Asplit = accumarray(idx, 1:numel(UtmetarCat), [], @(indices) {UtmetarCat(indices)}); %distribute identical years in Dest
for f=1:length(duniqyear)-Nanrows
    SaveMet=Asplit{f};
    warning( 'off', 'MATLAB:xlswrite:AddSheet' ) ;
    xlswrite(b,SaveMet,f)
end
disp('Klart!')
I suppose it's the multiple 'for loops' that take up a lot of computing time. Any help is much appreciated!

10 Comments

Linus Dock on 13 Oct 2016
This is one of the input files.
The profiler could be a nice tool to help identify what takes time: https://uk.mathworks.com/help/matlab/matlab_prog/profiling-for-improving-performance.html
Linus Dock on 13 Oct 2016
Thank you! I used the profiler and it seems that the 'xlswrite' function is the main culprit but I need it unfortunately. I also wonder if there is a way to shorten the code and reduce the number of 'for loops'?
Adam on 13 Oct 2016
What % of the time is spent in xlswrite? Because if it is something like 80% then there is very little point trying to speed up for loops. They would represent at most 20% of your total time so any gain would only be in that proportion. Even if you speed those up 10-fold they would only reduce your total time by 18% in that case.
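The arithmetic in this comment can be checked with a few lines. This is only a sketch: the 20% share and the 10-fold speed-up are the illustrative figures from the comment, not measurements of the actual script, and the variable names are made up for the example.

```matlab
% Assumed figures taken from the comment, not from a real profile run
fracLoops = 0.20;   % share of total runtime spent in the for loops
speedup   = 10;     % hypothetical speed-up factor achieved for the loops

% Runtime after the optimisation, as a fraction of the original runtime
newTotal = (1 - fracLoops) + fracLoops / speedup;
saving   = 1 - newTotal;

fprintf('Total runtime reduced by %.0f%%\n', 100 * saving);
```

With a real profiler report, fracLoops would be replaced by the measured share of time spent outside xlswrite.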
Swarooph on 13 Oct 2016
Pardon me, but this is not a specific answer. Take a look at the following webinar. I found it really helpful to get a big picture view of concepts you could use in MATLAB to improve and optimize your code for performance.
Linus Dock on 13 Oct 2016
Great thank you, I will take a look at this.
Mostafa on 13 Oct 2016
Edited: Mostafa on 13 Oct 2016
You can try this Improved xlswrite to speed up writing with xlswrite.
Guillaume on 13 Oct 2016
Edited: Guillaume on 13 Oct 2016
To help people understand what your code is doing, it would be great if you:
a) used meaningful variable names (so that for example when we come to s = strfind(Data{l},a), we don't have to search back from the beginning to find what was a)
b) wrote a comment at the beginning of each loop explaining what its purpose is
Also, what version of MATLAB are you using? You're using some functions that have been deprecated for years (e.g. strread). The version of MATLAB is also relevant for xlswrite: there have been some improvements in recent versions to speed up repeated xlswrite calls.
Finally, I believe I have said this before: do not use empty as a variable name. It's already an important MATLAB function.
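As a rough illustration of the strread remark, the line splitting could be done with textscan instead. This is a sketch under assumptions: the sample text is made up, the 'Whitespace' option is set to empty so that spaces inside a line are kept, and the behaviour should be checked against the documentation of the MATLAB release in use.

```matlab
% Hypothetical file content standing in for the char data read with fread
raw = sprintf('ESSA 010020Z 24005KT\nESGG 010020Z 20010KT');

% Split into one string per line; textscan wraps the result in a cell
parts = textscan(raw, '%s', 'Delimiter', '\n', 'Whitespace', '');
metarLines = parts{1};   % N-by-1 cell array of strings

% A descriptive name such as isMatch avoids shadowing the empty function
isMatch = false(numel(metarLines), 1);
```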
Thanks for all your help! I have downloaded the improved xlswrite function but now I get this error:
??? Error using ==> xlswrite at 219
Attempt to reference field of non-structure array.
Error in ==> Metartest at 84
xlswrite(b,Asplit{f},f)
The xlswrite.m file is in my working directory. Any suggestions?
Linus Dock on 14 Oct 2016
I'm using Matlab 7.12.0 (R2011a) so yes it is quite old :)


Accepted Answer

dbmn on 13 Oct 2016
Just a few basic tips; there is plenty of reading material available online on this topic.
  • The profiler is always a good idea to start. You can use it with the following code and it should help you identify the bottlenecks of your code.
profile on
% here comes your code (maybe without the clear all)
profile viewer
  • Then you should try to avoid loops (especially nested loops), either by vectorization or with MATLAB built-ins such as arrayfun, cellfun, structfun, etc. I assume that the following two loops can be replaced by a cellfun call.
for l = 1:length(Data); %and
for k = 1:length(Data{l})
  • You could avoid creating unnecessary variables like
SaveMet=Asplit{f};
xlswrite(b,SaveMet,f)
and simply use
xlswrite(b,Asplit{f},f)
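A minimal sketch of the cellfun idea for the inner loop, under assumptions: the sample lines are made up, icao plays the role of a in the question, and, as far as I can tell, the original if ind==1 test is true only when the code occurs exactly once in a line, which is what the comparison below reproduces.

```matlab
% Hypothetical sample standing in for Data{l} (one month of METAR lines)
metarLines = {'ESSA 010020Z 24005KT'; ...
              'ESGG 010020Z 20010KT'; ...
              'ESSA 010050Z 25006KT'};
icao = 'ESSA';                        % plays the role of a in the question

hits = strfind(metarLines, icao);     % match positions for every line
keep = cellfun(@numel, hits) == 1;    % true where the code occurs exactly once
selected = metarLines(keep);          % replaces the find(empty) indexing
```

If the intent is really "lines that start with the code", strncmp(metarLines, icao, numel(icao)) would be a fully vectorised alternative, but that changes the selection rule slightly.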

1 Comment

Guillaume on 13 Oct 2016
Edited: Guillaume on 13 Oct 2016
Yes, avoiding loops is a good idea (although MATLAB has improved in this respect, so it may not be as critical as it used to be). Vectorised operations can bring a great speed-up. However, it's unlikely that replacing a loop with cellfun, arrayfun and friends is going to make it faster. If anything, due to the cost of the extra function call (particularly if you use anonymous functions), it can actually be slower. However, you do gain in clarity and code quality (in my opinion).
Creating temporary variables should not be a problem, since until they are modified, they're just a pointer to the original variable.
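One way to check the loop versus cellfun claim on a given machine is a small timing comparison. This is only a sketch with made-up test data; the timings depend on the MATLAB release and hardware, so no particular result is implied.

```matlab
values = num2cell(rand(1, 1e5));      % hypothetical test data

% Plain for loop with a preallocated output
tic
outLoop = zeros(size(values));
for k = 1:numel(values)
    outLoop(k) = values{k} ^ 2;
end
tLoop = toc;

% Same computation through cellfun with an anonymous function
tic
outFun = cellfun(@(x) x ^ 2, values);
tFun = toc;

fprintf('loop: %.4f s, cellfun: %.4f s\n', tLoop, tFun);
```

Both versions compute the same result, so whichever reads better can be kept once the timings are known.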


Asked: 13 Oct 2016

Commented: 14 Oct 2016
