html, xpath, xidel

How to extract class names from an HTML tag


I am trying to extract the second class name from a <span> tag.

The Xidel documentation is really poor, so I can't work out how to use the filter() or contains() functions to match a <span> tag whose class contains "userstatus" and then extract its second class name.

This is what I have at the moment, but I don't know how to tell Xidel to match a <span> tag when one of the values in its class attribute is the word "userstatus".

xidel -e http://intranet.website.com '//li[@class='status']/span[@class==match("userstatus").....

Thank you for any suggestions

<li class="status">
  <span class="userstatus offline strongfont2">
    blaa bllaa foo text
  </span>
</li>

<li class="status">
  <span class="userstatus online italicfont1">
    blaa bllaa foo text
  </span>
</li>

I need to extract the class attribute of the <span> tag.
I don't need the text or HTML content of the <span> tag.

The result should look like this:

class="userstatus offline strongfont2"

class="userstatus online italicfont1"


Solution

  • If you want to find <span> elements whose class attribute value contains "userstatus" and then return the class attribute, you can use the following XPath 1.0 expression:

    //li[@class='status']/span[contains(@class, 'userstatus')]/@class
    

    Since Xidel seems to support XPath 2.0, you can use the following expression to extract only the second CSS class from the above <span> elements:

    for $span in //li[@class='status']/span[contains(@class, 'userstatus')]
    return tokenize($span/@class, ' ')[2]
    

    I've never used Xidel before, but the above XPath seems to work when tested in the Xidel online tester. You can also see a demo of the above XPath on xpathtester.com.
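
If it helps to see the logic outside of Xidel, here is a rough Python sketch of the same two steps: select the matching <span> elements, then split the class attribute on whitespace and take the second token. It is not Xidel itself. It assumes the markup is well-formed enough to parse as XML, wraps the sample from the question in a made-up <ul> root, and the helper names userstatus_classes and second_class are just illustrative. The standard library's ElementTree has no contains(), so that check is done in plain Python.

```python
import xml.etree.ElementTree as ET

# Sample markup from the question, wrapped in a <ul> root (an assumption)
# so that it parses as a single XML document.
MARKUP = """
<ul>
<li class="status">
  <span class="userstatus offline strongfont2">
    blaa bllaa foo text
  </span>
</li>
<li class="status">
  <span class="userstatus online italicfont1">
    blaa bllaa foo text
  </span>
</li>
</ul>
"""


def userstatus_classes(markup):
    """Return the class attribute of every li[@class='status']/span
    whose class value contains the substring 'userstatus'."""
    root = ET.fromstring(markup)
    found = []
    # ElementTree supports only a subset of XPath, so the path selects
    # the spans and the substring test (like contains()) is done here.
    for span in root.findall(".//li[@class='status']/span"):
        cls = span.get("class", "")
        if "userstatus" in cls:
            found.append(cls)
    return found


def second_class(cls):
    """Return the second whitespace-separated class name, or None."""
    tokens = cls.split()
    return tokens[1] if len(tokens) > 1 else None


classes = userstatus_classes(MARKUP)
for cls in classes:
    print(f'class="{cls}"', "->", second_class(cls))
```

Note that, like contains() in the XPath above, the substring test would also match a class such as "userstatusbar". Comparing against the individual tokens from cls.split() is stricter if that matters.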