xml2struct is very slow...

Robin Schäfer on 5 Mar 2020
Answered: Pierre Harouimi on 1 Dec 2021
Hello community,
I use xmlread() and xml2struct() to read the latitude, longitude and time vectors from a .gpx file (which is formatted as XML). This solution is very slow: according to the profiler, xml2struct>getNodeData takes over 90% of the run time of my whole script. Here is an example of the call in my function:
DOM=xmlread(file);
GPX=xml2struct(DOM);
time=strings(size(GPX.gpx.trk.trkseg.trkpt))'; % preallocation
lat=strings(size(time));
lon=strings(size(time));
for b=1:length(GPX.gpx.trk.trkseg.trkpt) % extract parameters
time(b)=GPX.gpx.trk.trkseg.trkpt{1,b}.time.Text;
lat(b)=GPX.gpx.trk.trkseg.trkpt{1,b}.Attributes.lat;
lon(b)=GPX.gpx.trk.trkseg.trkpt{1,b}.Attributes.lon;
end
lat=str2double(lat);
lon=str2double(lon);
time=extractAfter(time,'T');
time=extractBefore(time,'Z');
% the time will be formatted later on
The main speed bottleneck is not the for-loop but the xml2struct() call.
Here is an example of my data, opened in a text editor:
<?xml version="1.0" encoding="UTF-8"?>
<gpx xmlns="http://www.topografix.com/GPX/1/1" version="1.1" creator="Polar Pro">
<metadata><author><name>Polar</name></author><time>2020-01-29T16:16:08.000Z</time></metadata>
<trk><trkseg>
<trkpt lat="-33.94066317" lon="18.86733817"><ele>0.0</ele><time>2020-01-29T16:16:10.000Z</time></trkpt>
<trkpt lat="-33.940 ….
Any suggestions to improve the speed? It really matters, because the function is going to be called several times...
EDIT 17.03.2020: Attached file as an example
Robin Schäfer on 17 Mar 2020
-UP-
I still need some speed improvement. Can anybody help?
I also tried reading the data with the code below, adapted from another question on this site. However, I could not get it working for the time values, and there was no apparent speed increase...
DOM=xmlread(file);
x_root = DOM.getFirstChild;
trkpt_nodes = x_root.getElementsByTagName('trkpt');
time_nodes = x_root.getElementsByTagName('time');
lon=NaN(trkpt_nodes.getLength);
lat=NaN(size(lon));
time=NaN(size(lon));
for i = 0 : trkpt_nodes.getLength - 1
trkpt_element = trkpt_nodes.item(i);
lat(i+1)=trkpt_element.getAttribute('lat');
lon(i+1)=trkpt_element.getAttribute('lon');
% time extraction missing
end
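A minimal sketch of how the DOM attempt above might be completed, not a tested drop-in replacement. Two details in the attempt are worth noting: NaN(n) with a single argument allocates an n-by-n matrix, and getAttribute returns a Java string that has to be converted before it can be stored in a numeric array. The temporary sample file and the variable names are assumptions made for illustration; the first track point comes from the data shown in the question, the second one is made up.

```matlab
% Write a tiny GPX sample so the sketch runs on its own.
% First track point: from the question. Second: invented for illustration.
file = [tempname '.gpx'];
sample = {
    '<?xml version="1.0" encoding="UTF-8"?>'
    '<gpx xmlns="http://www.topografix.com/GPX/1/1" version="1.1" creator="Polar Pro">'
    '<trk><trkseg>'
    '<trkpt lat="-33.94066317" lon="18.86733817"><ele>0.0</ele><time>2020-01-29T16:16:10.000Z</time></trkpt>'
    '<trkpt lat="-33.94066400" lon="18.86733900"><ele>0.0</ele><time>2020-01-29T16:16:11.000Z</time></trkpt>'
    '</trkseg></trk>'
    '</gpx>'};
fid = fopen(file, 'w');
fprintf(fid, '%s\n', sample{:});
fclose(fid);

DOM = xmlread(file);
trkptNodes = DOM.getElementsByTagName('trkpt');
n = trkptNodes.getLength;

% Preallocate column vectors (NaN(n) alone would give an n-by-n matrix).
lat  = NaN(n, 1);
lon  = NaN(n, 1);
time = strings(n, 1);

for k = 0:n-1                          % DOM node lists are zero-based
    pt = trkptNodes.item(k);
    % getAttribute returns a Java string: convert to char, then to double.
    lat(k+1) = str2double(char(pt.getAttribute('lat')));
    lon(k+1) = str2double(char(pt.getAttribute('lon')));
    % Look up <time> inside this track point only, so the <time> element
    % in <metadata> cannot shift the indices.
    timeNodes = pt.getElementsByTagName('time');
    if timeNodes.getLength > 0
        time(k+1) = string(char(timeNodes.item(0).getTextContent()));
    end
end

delete(file);                          % clean up the temporary sample
```

Searching for time below each trkpt node, rather than below the document root as in the attempt, keeps the time values aligned with lat and lon. Whether this is faster than xml2struct for a large file would need to be checked with the profiler.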


Answers (1)

Pierre Harouimi on 1 Dec 2021
This comes a long time later, but I hope it helps.
You can use the readstruct function (introduced in R2020b):
GPX = readstruct("Example.xml");
time = [GPX.trk.trkseg.trkpt.time]';
lat = [GPX.trk.trkseg.trkpt.latAttribute]';
lon = [GPX.trk.trkseg.trkpt.lonAttribute]';
time = datetime(time,'InputFormat','yyyy-MM-dd''T''HH:mm:ss.SSS''Z'''); % the literal Z needs a closing quote as well
[h,m,s] = hms(time);
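A self-contained sketch of the same readstruct approach applied to a file with a .gpx extension, under a few assumptions: the FileType option is passed because readstruct may not recognize .gpx as XML on its own, the temporary sample file is hypothetical (first track point from the question, second one made up), and the isdatetime guard is there because the time field may come back either as text or already as datetime depending on how the values are detected.

```matlab
% Write a tiny GPX sample so the sketch runs on its own.
% First track point: from the question. Second: invented for illustration.
file = [tempname '.gpx'];
sample = {
    '<?xml version="1.0" encoding="UTF-8"?>'
    '<gpx xmlns="http://www.topografix.com/GPX/1/1" version="1.1" creator="Polar Pro">'
    '<trk><trkseg>'
    '<trkpt lat="-33.94066317" lon="18.86733817"><ele>0.0</ele><time>2020-01-29T16:16:10.000Z</time></trkpt>'
    '<trkpt lat="-33.94066400" lon="18.86733900"><ele>0.0</ele><time>2020-01-29T16:16:11.000Z</time></trkpt>'
    '</trkseg></trk>'
    '</gpx>'};
fid = fopen(file, 'w');
fprintf(fid, '%s\n', sample{:});
fclose(fid);

% Tell readstruct explicitly that the file is XML (requires R2020b or later).
GPX = readstruct(file, 'FileType', 'xml');

pts = GPX.trk.trkseg.trkpt;            % struct array, one element per <trkpt>
lat = [pts.latAttribute]';             % XML attributes get the suffix "Attribute"
lon = [pts.lonAttribute]';
t   = [pts.time]';

% Convert only if the timestamps were imported as text.
if ~isdatetime(t)
    t = datetime(t, 'InputFormat', 'yyyy-MM-dd''T''HH:mm:ss.SSS''Z''', ...
        'TimeZone', 'UTC');
end
[h, m, s] = hms(t);

delete(file);                          % clean up the temporary sample
```

This sketch assumes a single trk with a single trkseg, as in the sample data; files with several segments would need an extra loop or concatenation step.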
Bonus: you can also use the gpxread function (Mapping Toolbox) directly.

Release

R2019b
