I'm trying to use xmlstarlet to extract the text in certain elements in this XML feed:
https://services.boatwizard.com/bridge/events/bc0af0c8-4b47-42b3-9a71-5326775344e0/boats?status=on
One of the elements I'd like to extract is the city name, which appears in the XML document as follows (some parent elements omitted for clarity):
<Location>
  <LocationAddress>
    <CityName>St Malo</CityName>
    <CountryID>FR</CountryID>
    <Postcode>35400</Postcode>
  </LocationAddress>
</Location>
I'm trying to extract "St Malo".
I've saved the feed to boats.xml and used xmlstarlet el -v boats.xml to figure out the correct XPath, which seems to be:
ProcessVehicleRemarketingDataArea/VehicleRemarketing/VehicleRemarketingBoatLineItem/Location/LocationAddress/CityName
I am trying the following syntax to extract the text:
xml sel -t -m "ProcessVehicleRemarketingDataArea/VehicleRemarketing/VehicleRemarketingBoatLineItem/Location/LocationAddress/CityName" -v "." -n boats.xml
I have tried many syntax variations with no success. I almost think the XML file itself might be off. How can I extract "St Malo"?
The XML in the link you supplied declares a default namespace in the VehicleRemarketing tag:
<VehicleRemarketing xmlns="http://www.starstandard.org/STAR/5" ...>
That means you have to bind that namespace to a prefix and use the prefix to qualify every step of your XPath expression that belongs to the namespace:
xml sel -N ns=http://www.starstandard.org/STAR/5 \
    -t -m "ProcessVehicleRemarketingDataArea/ns:VehicleRemarketing//ns:CityName" \
    -v "." -n boats.xml
The first element is not part of the namespace, but ns:VehicleRemarketing and all its descendants are. In this case you can also use just //ns:CityName as the expression (judging by the example you posted, it will return all CityName elements in the file).
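If it helps to see why the unprefixed path matches nothing, the same behavior can be reproduced outside xmlstarlet. Below is a minimal sketch using Python's standard-library ElementTree. The SAMPLE document is a trimmed-down, hypothetical stand-in for boats.xml that I built from the snippet in the question, not the real feed, and the prefix name ns is arbitrary.

```python
import xml.etree.ElementTree as ET

# Hypothetical, trimmed-down stand-in for boats.xml (not the real feed).
SAMPLE = """\
<ProcessVehicleRemarketingDataArea>
  <VehicleRemarketing xmlns="http://www.starstandard.org/STAR/5">
    <VehicleRemarketingBoatLineItem>
      <Location>
        <LocationAddress>
          <CityName>St Malo</CityName>
          <CountryID>FR</CountryID>
          <Postcode>35400</Postcode>
        </LocationAddress>
      </Location>
    </VehicleRemarketingBoatLineItem>
  </VehicleRemarketing>
</ProcessVehicleRemarketingDataArea>
"""

root = ET.fromstring(SAMPLE)

# Bind an arbitrary prefix to the namespace URI declared on VehicleRemarketing.
ns = {"ns": "http://www.starstandard.org/STAR/5"}

# An unprefixed name only matches elements in no namespace, so this finds nothing.
unqualified = [e.text for e in root.findall(".//CityName")]

# Qualifying the step with the bound prefix finds the element.
qualified = [e.text for e in root.findall(".//ns:CityName", ns)]

print(unqualified)  # []
print(qualified)    # ['St Malo']
```

The prefix itself does not have to match anything in the file. Only the namespace URI it is bound to matters, which is why ns works even though the feed declares the namespace without a prefix.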