xmlstarlet "does not work" for XMLs with namespaces


I'm using mediainfo to get XML metadata about a movie:

mediainfo --Output=XML Krtek\ a\ buldozer-jdvwqZUEbhc.mkv  | xmlstarlet format

whose output is:

<?xml version="1.0" encoding="UTF-8"?>
<MediaInfo xmlns="https://mediaarea.net/mediainfo" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="https://mediaarea.net/mediainfo https://mediaarea.net/mediainfo/mediainfo_2_0.xsd" version="2.0">
  <creatingLibrary version="18.03" url="https://mediaarea.net/MediaInfo">MediaInfoLib</creatingLibrary>
  <media ref="Krtek a buldozer-jdvwqZUEbhc.mkv">
    <track type="General">
      <UniqueID>101120522676894244607292274887483611459</UniqueID>
      <VideoCount>1</VideoCount>
      <AudioCount>1</AudioCount>
      <FileExtension>mkv</FileExtension>
      <Format>Matroska</Format>
      <Format_Version>4</Format_Version>
      <FileSize>60132643</FileSize>
      <Duration>374.101</Duration>
      <OverallBitRate>1285912</OverallBitRate>
      <FrameRate>25.000</FrameRate>
      <FrameCount>9352</FrameCount>
      <IsStreamable>Yes</IsStreamable>
      <File_Modified_Date>UTC 2018-10-15 07:09:29</File_Modified_Date>
      <File_Modified_Date_Local>2018-10-15 09:09:29</File_Modified_Date_Local>
      <Encoded_Application>Lavf57.71.100</Encoded_Application>
      <Encoded_Library>Lavf57.71.100</Encoded_Library>
      <extra>
        <ErrorDetectionType>Per level 1</ErrorDetectionType>
      </extra>
    </track>
    <track type="Video">
      <StreamOrder>0</StreamOrder>
      <ID>1</ID>
      <UniqueID>1</UniqueID>
      <Format>AVC</Format>
      <Format_Profile>High</Format_Profile>
      <Format_Level>4</Format_Level>
      <Format_Settings_CABAC>Yes</Format_Settings_CABAC>
      <Format_Settings_RefFrames>3</Format_Settings_RefFrames>
      <CodecID>V_MPEG4/ISO/AVC</CodecID>
      <Duration>374.080000000</Duration>
      <Width>1920</Width>
      <Height>1080</Height>
      <Stored_Height>1088</Stored_Height>
      <Sampled_Width>1920</Sampled_Width>
      <Sampled_Height>1080</Sampled_Height>
      <PixelAspectRatio>1.000</PixelAspectRatio>
      <DisplayAspectRatio>1.778</DisplayAspectRatio>
      <FrameRate_Mode>CFR</FrameRate_Mode>
      <FrameRate_Mode_Original>VFR</FrameRate_Mode_Original>
      <FrameRate>25.000</FrameRate>
      <FrameCount>9352</FrameCount>
      <ColorSpace>YUV</ColorSpace>
      <ChromaSubsampling>4:2:0</ChromaSubsampling>
      <BitDepth>8</BitDepth>
      <ScanType>Progressive</ScanType>
      <Delay>0.000</Delay>
      <Default>Yes</Default>
      <Forced>No</Forced>
      <colour_range>Limited</colour_range>
      <colour_description_present>Yes</colour_description_present>
      <colour_primaries>BT.709</colour_primaries>
      <transfer_characteristics>BT.709</transfer_characteristics>
      <matrix_coefficients>BT.709</matrix_coefficients>
    </track>
    <track type="Audio">
      <StreamOrder>1</StreamOrder>
      <ID>2</ID>
      <UniqueID>2</UniqueID>
      <Format>Opus</Format>
      <CodecID>A_OPUS</CodecID>
      <Duration>374.101000000</Duration>
      <Channels>2</Channels>
      <ChannelPositions>Front: L R</ChannelPositions>
      <SamplingRate>48000</SamplingRate>
      <SamplingCount>17956848</SamplingCount>
      <BitDepth>32</BitDepth>
      <Compression_Mode>Lossy</Compression_Mode>
      <Delay>0.000</Delay>
      <Delay_Source>Container</Delay_Source>
      <Language>en</Language>
      <Default>Yes</Default>
      <Forced>No</Forced>
    </track>
  </media>
</MediaInfo>

Now say that I want to get all IDs:

... | xmlstarlet sel -t -v "//ID"

and nothing is printed. What? Why? Well, it turned out that if I remove all attributes from the tag on the second line, the same selection command works. Now I understand that xmlstarlet (probably) works just fine and I'm just missing some magic flag or syntax that lets it process XML with declared namespaces. Can someone advise?


Solution

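  • The xmlns="..." attribute on the root element puts every element of the document into that namespace, and in XPath 1.0 an unprefixed name such as ID only matches elements that are in no namespace, so //ID finds nothing. You need to declare the namespace with the -N option and use its prefix in the query, in the form <prefix>:<element>. The prefix itself (n here) is arbitrary; only the URI has to match the document: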

    ... | xmlstarlet sel -N n="https://mediaarea.net/mediainfo" -t -v "//n:ID" 
    

    From the help page:

    -N <name>=<value>
    - predefine namespaces (name without 'xmlns:')
    ex: xsql=urn:oracle-xsql
    Multiple -N options are allowed.
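
The same behaviour can be reproduced outside xmlstarlet, which may help confirm that the cause is the default namespace and not the tool. Below is a minimal sketch using Python's standard-library xml.etree.ElementTree on a trimmed-down stand-in for the mediainfo output. The sample document, the prefix n, and the function names are illustrative assumptions, not part of the original question.

```python
import xml.etree.ElementTree as ET

# Trimmed-down stand-in for the mediainfo output (same default namespace).
SAMPLE = """<MediaInfo xmlns="https://mediaarea.net/mediainfo" version="2.0">
  <media ref="Krtek a buldozer-jdvwqZUEbhc.mkv">
    <track type="General"><Format>Matroska</Format></track>
    <track type="Video"><ID>1</ID></track>
    <track type="Audio"><ID>2</ID></track>
  </media>
</MediaInfo>"""

# Arbitrary prefix mapped to the document's namespace URI,
# playing the same role as -N n="https://mediaarea.net/mediainfo".
NS = {"n": "https://mediaarea.net/mediainfo"}

root = ET.fromstring(SAMPLE)


def ids_without_namespace(node):
    """Unprefixed query: looks for ID elements that are in no namespace."""
    return [e.text for e in node.findall(".//ID")]


def ids_with_prefix(node):
    """Prefixed query: looks for ID elements in the mediainfo namespace."""
    return [e.text for e in node.findall(".//n:ID", NS)]


if __name__ == "__main__":
    print(ids_without_namespace(root))  # no matches, like "//ID"
    print(ids_with_prefix(root))        # both track IDs, like "//n:ID"
```

The unprefixed query returns an empty list for the same reason "//ID" prints nothing in xmlstarlet, while the prefixed query finds both track IDs.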