Use xmlstarlet to select nodes that do NOT contain a specific subnode


I have thousands of records similar to the one below:

<holding>
  <holding_id>2225031160001858</holding_id>
  <record>
    <leader>00210cx a22200085 454500</leader>
    <controlfield tag="001">h38165-01alliance_ohsu</controlfield>
    <controlfield tag="004">b10145746-01alliance_ohsu</controlfield>
    <controlfield tag="005">20200417125900.0</controlfield>
    <controlfield tag="008">2004170u\\\\0\\\0001aaund0999999</controlfield>
    <datafield ind1="2" ind2=" " tag="852">
      <subfield code="b">OHSUMAIN</subfield>
      <subfield code="c">oldstorjrl</subfield>
    </datafield>
  </record>
</holding>

I need to set @ind1 to " " on every datafield where @tag="852" and no subfield with @code="h" exists. In this example, subfields with @code="b" and @code="c" exist but @code="h" does not, so this record should be modified.

I can think of ways to accomplish what I need using program logic, but can I use xmlstarlet directly to select the nodes I want based on the absence of a subnode?

The desired output for this record would be:

<holding>
  <holding_id>2225031160001858</holding_id>
  <record>
    <leader>00210cx a22200085 454500</leader>
    <controlfield tag="001">h38165-01alliance_ohsu</controlfield>
    <controlfield tag="004">b10145746-01alliance_ohsu</controlfield>
    <controlfield tag="005">20200417125900.0</controlfield>
    <controlfield tag="008">2004170u\\\\0\\\0001aaund0999999</controlfield>
    <datafield ind1=" " ind2=" " tag="852">
      <subfield code="b">OHSUMAIN</subfield>
      <subfield code="c">oldstorjrl</subfield>
    </datafield>
  </record>
</holding>

Solution

  • Not sure how I missed this, but it turned out to be straightforward: a second predicate using not() excludes any datafield that has a subfield with @code="h".

    xmlstarlet ed -u '/holding/record/datafield[@tag="852"][not(subfield[@code="h"])]/@ind1' -v ' '
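
To see the predicate in action, here is a minimal sketch that runs the same command against a small test file. The file name sample.xml and the second datafield (the one with a subfield @code="h" and its placeholder value) are made up for illustration and are not from the original records.

```shell
# sample.xml is a hypothetical file name used only for this illustration.
# It holds two 852 datafields: one without a code="h" subfield, one with.
cat > sample.xml <<'EOF'
<holding>
  <record>
    <datafield ind1="2" ind2=" " tag="852">
      <subfield code="b">OHSUMAIN</subfield>
      <subfield code="c">oldstorjrl</subfield>
    </datafield>
    <datafield ind1="2" ind2=" " tag="852">
      <subfield code="b">OHSUMAIN</subfield>
      <subfield code="h">placeholder call number</subfield>
    </datafield>
  </record>
</holding>
EOF

# Update @ind1 only on 852 datafields that have no subfield with code="h".
result=$(xmlstarlet ed \
  -u '/holding/record/datafield[@tag="852"][not(subfield[@code="h"])]/@ind1' \
  -v ' ' sample.xml)

# The first datafield should now carry ind1=" "; the second keeps ind1="2".
printf '%s\n' "$result"
```

By default the edited document goes to standard output and the input file is left untouched. xmlstarlet ed also accepts -L (--inplace) to write the change back to the file, which is worth trying on a copy first.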