Select node using XPATH 1.0 and xmlstarlet containing specific text


Given an XML document that begins as follows:

<?xml version="1.0" encoding="UTF-8"?><searchRetrieveResponse>
  <version>1.2</version>
  <numberOfRecords>1</numberOfRecords>
  <records>
    <record>
      <recordSchema>marcxml</recordSchema>
      <recordPacking>xml</recordPacking>
      <recordData>
        <record>
          <leader>01448cam a2200445Ia 4500</leader>
          <controlfield tag="001">9910650701858</controlfield>
          <controlfield tag="005">20181227054218.2</controlfield>
          <controlfield tag="008">930525s1941    nyu      b    001 0 eng d</controlfield>
          <datafield tag="035" ind1=" " ind2=" ">
            <subfield code="a">(OCoLC)28157672</subfield>
          </datafield>
          <datafield tag="035" ind1=" " ind2=" ">
            <subfield code="a">(OCoLC)ocm28157672</subfield>
          </datafield>
          <datafield tag="035" ind1=" " ind2=" ">
            <subfield code="a">(EXLNZ-01ALLIANCE_NETWORK)99153881770001451</subfield>
          </datafield>
          <datafield tag="040" ind1=" " ind2=" ">
            <subfield code="a">UTS</subfield>
            <subfield code="b">eng</subfield>
            <subfield code="c">UTS</subfield>

I need to select only the text node in /searchRetrieveResponse/records/record/recordData/record/datafield[@tag="035"]/subfield[@code="a"] that contains (EXLNZ-01ALLIANCE_NETWORK), using xmlstarlet (XPath 1.0), so the desired output is (EXLNZ-01ALLIANCE_NETWORK)99153881770001451

I have attempted many variations of xmlstarlet sel -T -t -m '/searchRetrieveResponse/records/record/recordData/record/datafield[@tag="035"]/subfield[@code="a"][text()[contains(.,'ALLIANCE_NETWORK')]]' -v '.' but I keep getting all of the 035/subfield[@code="a"] values rather than just the one I want. What am I doing wrong? Thanks


Solution

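  • Figured it out: the contains() filter was quoted incorrectly. The inner single quotes around ALLIANCE_NETWORK closed the shell's single-quoted string, so xmlstarlet received contains(.,ALLIANCE_NETWORK) with no quotes at all. XPath 1.0 reads the bare name as a child-element path, which selects nothing here and converts to an empty string, and contains() with an empty second argument is always true, so every subfield matched. Using double quotes inside the single-quoted expression fixes it. I'm posting only because I found matching the node awkward.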

    xmlstarlet sel -T -t -m '/searchRetrieveResponse/records/record/recordData/record/datafield[@tag="035"]/subfield[@code="a"][contains(text(), "ALLIANCE_NETWORK")]' -v '.'
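
To see why the quoting matters, it can help to print what the shell actually hands to xmlstarlet. The sketch below uses shortened expressions and two illustrative variable names (`broken` and `fixed`, not part of the original post); it only demonstrates shell quoting and does not call xmlstarlet itself.

```shell
# Inner single quotes close and reopen the outer single-quoted string,
# so the shell drops them before xmlstarlet ever sees the expression.
broken='subfield[text()[contains(.,'ALLIANCE_NETWORK')]]'

# Double quotes are literal characters inside a single-quoted string,
# so they survive and XPath receives a proper string literal.
fixed='subfield[contains(text(), "ALLIANCE_NETWORK")]'

printf '%s\n' "$broken"   # subfield[text()[contains(.,ALLIANCE_NETWORK)]]
printf '%s\n' "$fixed"    # subfield[contains(text(), "ALLIANCE_NETWORK")]
```

In the first expression ALLIANCE_NETWORK arrives as a bare element name rather than a string, which is why the predicate stopped filtering anything.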