bash awk grep pcregrep

Multiline grep with specific text


There is an XML file with a lot of <A_tag> elements in it.

I need to print every <A_tag> element, including its whole content (all of its children), that contains at least one <C_tag>.

So this block should match (and therefore appear in the result):

<A_tag>
    ...
    ...
    <C_tag attr1="" ... attrn="" />
    ...
</A_tag>

I tried using pcregrep, but I don't know how to express a block terminator that is longer than one character. </A_tag> is longer than that, whereas a single-character class such as [^>] would be easy for me.

I also tried awk, but couldn't get it to work either.

If someone experienced could help, please make the command separate the matched blocks with an empty line as well, so that I can learn more from it.


Solution

  • Following up on the xmllint comment:

    xmllint --xpath '(//A_tag/C_tag/..)' x.xml
    

    This selects every C_tag that is a direct child of an A_tag, then steps up to the parent A_tag and prints it.

    Output:

    <A_tag>
        <C_tag attr1="" attrn=""/>
    </A_tag>
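
Since the question also asks for an awk approach with an empty line between the matched blocks, here is a minimal sketch of one way that might look. It is not part of the xmllint answer and it is not a real XML parser: it assumes that each <A_tag> starts on its own line, that A_tag elements are not nested, and that the <C_tag name and the character after it are on the same line. The function name a_blocks_with_c is made up for illustration.

```shell
#!/bin/sh
# Sketch only: line-based matching, not real XML parsing.
# Assumes <A_tag> blocks are not nested (hypothetical helper name).
a_blocks_with_c() {
    awk '
        # An opening <A_tag ...> starts a new buffered block.
        $0 ~ "<A_tag([ >]|$)" { in_block = 1; buf = ""; has_c = 0 }

        # While inside a block, buffer each line and note any <C_tag ...>.
        in_block {
            buf = buf $0 "\n"
            if ($0 ~ "<C_tag([ />]|$)") has_c = 1
        }

        # On </A_tag>, print the block plus an empty line if it held a C_tag.
        in_block && $0 ~ "</A_tag>" {
            if (has_c) printf "%s\n", buf
            in_block = 0
        }
    ' "$1"
}
```

Calling a_blocks_with_c x.xml would print each matching block followed by an empty line. For input that does not meet the assumptions above, an XML-aware tool such as xmllint is the safer choice.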