clojure, html-parsing, enlive

Clojure (Enlive): how to use html/but (negation)?


Hi all, I'm trying to parse/extract HTML data with Clojure and Enlive (are there any better choices?).

I am trying to get all the ul > li tags that are NOT part of the <nav> tag. I think I should use the (html/but) function from Enlive, but I can't seem to make it work.

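;;test-envlive.clj
(require '[net.cgrand.enlive-html :as html])

;; Assumption: test-dom is the parsed test.html shown below.
(def test-dom (html/html-resource (java.io.File. "test.html")))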

(defn get-tags [dom tag-list]
  (let [tags
         (mapv
          #(vec (html/select dom %1))
          tag-list)]
    tags))

;;Gives NO tags
(get-tags test-dom [[[(html/but :nav) :ul :> :li]]])

;;Gives ALL the LI-tags
(get-tags test-dom [[:ul :> :li]])

<!-- test.html -->
<html>
<head><title>Test page</title>  </head>
<body>
    <div>
        <nav>
            <ul>
                <li>
                    skip these navs-li
                </li>
                
            </ul>
        </nav>
        <h1>Hello World<h1>                 
        <ul><li>get only these li's</li>                
        </ul>           
    </div>  
</body></html>

Solution

  • If you had valid XHTML (the sample's <h1> is never closed, so it is not), you could use XPath from sigel:

    (require '[sigel.xpath.core :as xpath])
    (let [data "<html><head><title>Test page</title></head>
                    <body><div><nav><ul><li>skip these navs-li</li></ul></nav>
                    <h1>Hello World</h1>
                    <ul><li>get only these li's</li></ul>
                    </div></body></html>"]
      ;; select every li that has no nav element among its ancestors
      (xpath/select data "//li[not(ancestor::nav)]"))
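
  • If you would rather stay with Enlive, note that a selector vector chains its steps as descendant relations. So [(html/but :nav) :ul :> :li] matches any li whose ul sits somewhere below a non-nav element, and the enclosing <div> already satisfies that. The extra vector level in the question appears to turn the steps into an intersection (one element that is ul and li at once), which nothing can match. A minimal sketch of one workaround, assuming Enlive is on the classpath: select every li, select the li elements inside nav, and remove the second group from the first. The names page and li-outside-nav are made up for this example, and the sample string is the question's page with the <h1> closed. Nodes are compared by value, so an li outside nav with exactly the same content as one inside nav would be dropped too.

```clojure
(require '[net.cgrand.enlive-html :as html])

;; Hypothetical sample data: the question's page with the <h1> closed.
(def page
  (html/html-snippet
   "<html><head><title>Test page</title></head>
    <body><div><nav><ul><li>skip these navs-li</li></ul></nav>
    <h1>Hello World</h1>
    <ul><li>get only these li's</li></ul>
    </div></body></html>"))

(defn li-outside-nav
  "Returns the ul > li nodes of dom that are not matched by [:nav :ul :> :li]."
  [dom]
  (let [all-li (html/select dom [:ul :> :li])
        nav-li (set (html/select dom [:nav :ul :> :li]))]
    ;; a set acts as a predicate, so remove drops every node found inside nav
    (remove nav-li all-li)))

;; Print the text of each remaining li.
(doseq [li (li-outside-nav page)]
  (println (html/text li)))
```

    For the sample page this should leave only the li from the second list. If duplicate content inside and outside nav is a concern, the XPath version above avoids the problem because it tests ancestry directly.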