I currently use an XML parser to extract the name of a route from a GPX (XML) file.
Each GPX file contains a single "name" tag, which is what I've been extracting.
Here's the script:
#!/bin/bash
# $scripts is expected to be set in the environment (directory holding xmlparse.pl)
gpxpath=/mnt/gpxfiles
for file in "$gpxpath"/*
do
    filename=$file
    gpxname=$("$scripts"/xmlparse.pl "$file")
    echo "$filename  $gpxname" >> gpxparse.tmp
done
sort -k 2,2 gpxparse.tmp > gpxparse.out
cat gpxparse.out
And here's xmlparse.pl:
#!/usr/bin/perl
use strict;
use warnings;
use XML::Twig;
XML::Twig->new(
twig_handlers => {
'name' => sub { print $_ ->text }
}
)->parse( <> );
Here's an example GPX file:
<?xml version="1.0" encoding="UTF-8"?>
<gpx version="1.1" creator="creator" xsi:schemaLocation="http://www.topografix.com/GPX/1/1 http://www.topografix.com/GPX/1/1/gpx.xsd" xmlns="http://www.topografix.com/GPX/1/1" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
<metadata>
<referrer>Referrer</referrer>
<time>2019-06-17T06:02:23.000Z</time>
</metadata>
<trk>
<name>Another GPX file</name>
<trkseg>
<trkpt lon="-1.91990" lat="53.00131">
<ele>112.1</ele>
<time>2019-06-17T06:02:23.000Z</time>
</trkpt>
<trkpt lon="-1.91966" lat="53.00126">
<ele>113.6</ele>
<time>2019-06-17T06:02:25.000Z</time>
</trkpt>
<trkpt lon="-1.91962" lat="53.00125">
<ele>114.1</ele>
<time>2019-06-17T06:02:25.000Z</time>
</trkpt>
<trkpt lon="-1.91945" lat="53.00120">
<ele>115.5</ele>
<time>2019-06-17T06:02:26.000Z</time>
</trkpt>
</trkseg>
</trk>
</gpx>
I can successfully extract the name of the route using the scripts above. However, I'd additionally like to extract the first co-ordinate pair in each file.
A track is defined by a "trk" element, and a track can contain multiple segments ("trkseg"). Finally, within a trkseg are multiple "trkpt" elements (track points).
A track point usually consists of a latitude and longitude co-ordinate pair along with elevation and timestamp information.
I'm only looking to extract the first lat and lon within the first trkpt of the GPX file. Ideally, once the script has found the first co-ordinate pair it should exit and move onto the next file.
I've tried adding an additional Perl parse script using XML::Twig, but it seems to stumble when there are multiple elements with duplicate names.
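Using xmlstarlet to extract the "name" value and the lat and lon of the first trkpt in the file (the parentheses in (//_:trkpt)[1] select the first trkpt in the whole document, whereas //_:trkpt[1] would match the first trkpt of every trkseg):
xmlstarlet sel -t -v '//_:name' -o , \
-v '(//_:trkpt)[1]/@lat' -o , \
-v '(//_:trkpt)[1]/@lon' -n \
file.xml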
Another GPX file,53.00131,-1.91990
In the shell script, you can parse this output with:
IFS=, read -r gpxname lat long < <( xmlstarlet ... )
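Putting the pieces together, the loop from the question could look roughly like the sketch below. It is only an illustration under a few assumptions: gpx_fields and list_routes are hypothetical helper names, the output is tab-separated so that names containing spaces still sort as one field, and a route name that itself contains a comma would be split incorrectly by the read. It pipes into a command group rather than using process substitution, so the variables are only visible inside the braces.

```shell
#!/bin/bash
# Sketch only: gpxpath is the directory from the question; gpx_fields and
# list_routes are hypothetical helper names introduced for this example.
gpxpath=${gpxpath:-/mnt/gpxfiles}

# Print "name,lat,lon" for one GPX file (requires xmlstarlet).
# (//_:trkpt)[1] is the first trkpt in the whole document.
gpx_fields() {
    xmlstarlet sel -t -v '//_:name' -o , \
        -v '(//_:trkpt)[1]/@lat' -o , \
        -v '(//_:trkpt)[1]/@lon' -n \
        "$1"
}

# Print one tab-separated "file name lat lon" line per file in directory $1,
# sorted by route name (field 2).
list_routes() {
    tab=$(printf '\t')
    for file in "$1"/*; do
        [ -f "$file" ] || continue    # skip non-files and an unmatched glob
        gpx_fields "$file" | {
            IFS=, read -r gpxname lat lon
            printf '%s\t%s\t%s\t%s\n' "$file" "$gpxname" "$lat" "$lon"
        }
    done | sort -t "$tab" -k 2,2
}

list_routes "$gpxpath"
```

Because xmlstarlet is started once per file and evaluates only the three expressions, there is no separate Perl helper to maintain; whether it stops reading early on very large files is not something this sketch relies on.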