
What XSLT allows me to skip <li> tags that contain <a> tags with certain href values?


This is a follow-on to my previous question.

I can't quite work out the XSLT to do the following. I have some HTML with one or more <ul> tags. The <li> tags may contain <a> tags. I want to remove any <li> tag if it contains an anchor where the href meets a certain pattern.

Example:

<ul>
  <li><a href="/some/old/path">One</a></li>
  <li><a href="/other/old/path">Two</a></li>
  <li><a href="/some/older/path">Three</a></li>
  <li><a href="/other/older/path">Four</a></li>
</ul>

I wish to remove the <li> lines where the href contains older so the result would be:

<ul>
  <li><a href="/some/old/path">One</a></li>
  <li><a href="/other/old/path">Two</a></li>
</ul>

The lines I wish to remove could be in any order and scattered across multiple <ul> tags. I'm fine if I end up with an empty <ul></ul> pair (but bonus points if such a resulting empty list can be removed easily). <li> tags that do not contain an anchor or that contain a non-matching anchor should be left as-is.

I got close with the following:

<xsl:template match="li/a[contains(@href, 'older')]">
</xsl:template>

but this leaves the opening <li>:

<ul>
  <li><a href="/some/old/path">One</a></li>
  <li><a href="/other/old/path">Two</a></li>
  <li>
  <li>
</ul>

How do I get rid of the whole <li> line?

Here's the full HTML I'm working with:

<html>
<head>
<!-- lots of stuff I don't care about -->
</head>
<body>
<div>
  <!-- lots of stuff I don't care about -->
  <div>
     <!-- lots of stuff I don't care about -->
     <div id="key_div">
         <div id="ignore_this">
           <!-- lots of stuff I don't care about -->
         </div>
         <p>More junk I don't want</p>
         <p>Even more junk I don't want</p>
         <h2><span class="someClass" id="someID">Header</span></h2>
         <p>Stuff I want to keep</p>
         <!-- A lot of stuff I want to keep -->
         <p>More stuff I want to keep</p>
         <ul>
           <li><a href="/some/old/path">One</a></li>
           <li><a href="/some/old/other">Two</a></li>
           <li><a href="/some/older/path">Three</a></li>
           <li><a href="/some/older/other">Four</a></li>
         </ul>
         <ul>
           <li>Leave this as-is</li>
         </ul>
     </div>
     <!-- lots of stuff I don't care about -->
  </div>
  <!-- lots of stuff I don't care about -->
</div>
</body>
</html>

And here's the XSLT:

<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:output method="html" indent="yes" encoding="utf-8"/>

    <xsl:template match="/html">
        <html>
            <head>
                <title></title>
            </head>
            <body>
                <xsl:apply-templates select="//div[@id='key_div']/h2"/>
            </body>
        </html>
    </xsl:template>

    <xsl:template match="h2">
        <h1>
            <xsl:value-of select="." />
        </h1>
        <xsl:apply-templates select="following-sibling::*"/>
    </xsl:template>

    <!-- My failed attempt to remove certain li lines -->
    <xsl:template match="li/a[contains(@href, 'older')]">
    </xsl:template>

    <xsl:template match="@*|node()">
        <xsl:copy>
            <xsl:apply-templates select="@*|node()"/>
        </xsl:copy>
    </xsl:template>
</xsl:stylesheet>

My current result:

<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
<title></title>
</head>
<body>
<h1>Header</h1>
<p>Stuff I want to keep</p>
<p>More stuff I want to keep</p>
<ul>
           <li><a href="/some/old/path">One</a></li>
           <li><a href="/some/old/other">Two</a></li>
           <li>
           <li>
         </ul>
<ul>
           <li>Leave this as-is</li>
         </ul>
</body>
</html>

I just need to figure out how to remove the full <li> line for the matching hrefs.


Solution

  • With:

    match="li/a[contains(@href, 'older')]"
    

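    you’re matching the a element, so only the a is suppressed. Its parent li is still copied by the identity template, which is why an empty li is left behind.

    Try changing it to:

    match="li[contains(a/@href, 'older')]"
    

    This matches the li itself. Note that in XSLT 1.0, contains(a/@href, 'older') only tests the href of the first a child; if an li can hold several anchors, match="li[a[contains(@href, 'older')]]" checks all of them.

    (Untested, and I haven’t looked at your full XSLT.)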
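
To show how the corrected match fits with the identity template, here is a minimal sketch of a standalone XSLT 1.0 stylesheet. It is not the asker's full stylesheet: the /html and h2 templates are left out, and the second empty template (for the "bonus points" case of a list left with no items) is an assumption about what "empty" should mean, namely a ul in which every li would be removed.

```xml
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:output method="html" indent="yes" encoding="utf-8"/>

    <!-- Identity template: copy anything not matched by a more specific rule -->
    <xsl:template match="@*|node()">
        <xsl:copy>
            <xsl:apply-templates select="@*|node()"/>
        </xsl:copy>
    </xsl:template>

    <!-- Drop any li that has at least one child a whose href contains 'older' -->
    <xsl:template match="li[a[contains(@href, 'older')]]"/>

    <!-- Assumed "bonus" rule: drop a ul that has li children, all of which
         would be dropped by the rule above -->
    <xsl:template match="ul[li][not(li[not(a[contains(@href, 'older')])])]"/>
</xsl:stylesheet>
```

Both empty templates use patterns with predicates, so they get a higher default priority than the identity template and win for the nodes they match. An li with no anchor, such as "Leave this as-is", matches neither rule and is copied unchanged. In the asker's stylesheet, the two empty templates would take the place of the failed li/a template.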