Hi all, I'm trying to parse and extract data from HTML with Clojure and Enlive (are there any better choices?).
I am trying to get all the ul > li tags that are NOT inside the <nav> tag.
I think I should use the (html/but) function from Enlive, but I can't seem to make it work.
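;; test-envlive.clj
(require '[net.cgrand.enlive-html :as html])

;; test-dom is assumed to be the parsed test.html, e.g.
;; (html/html-resource (java.io.File. "test.html"))

(defn get-tags
  "Runs each selector in tag-list against dom; returns a vector of result vectors."
  [dom tag-list]
  (mapv #(vec (html/select dom %)) tag-list))

;; Gives NO tags
(get-tags test-dom [[[(html/but :nav) :ul :> :li]]])

;; Gives ALL the li tags
(get-tags test-dom [[:ul :> :li]])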
<!-- test.html -->
<html>
  <head><title>Test page</title> </head>
  <body>
    <div>
      <nav>
        <ul>
          <li>
            skip these navs-li
          </li>
        </ul>
      </nav>
      <h1>Hello World<h1>
      <ul><li>get only these li's</li>
      </ul>
    </div>
  </body></html>
If your input were valid XHTML, you could use XPath from Sigel:
(require '[sigel.xpath.core :as xpath])

(let [data "<html><head><title>Test page</title></head>
            <body><div><nav><ul><li>skip these navs-li</li></ul></nav>
            <h1>Hello World</h1>
            <ul><li>get only these li's</li></ul>
            </div></body></html>"]
  ;; select every li that has no nav ancestor
  (xpath/select data "//li[not(ancestor::nav)]"))
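
If you would rather stay in Enlive, here is a minimal sketch of one possible workaround. As far as I understand the selector syntax, html/but only negates a test on the node itself, so it cannot say "has no nav ancestor": in the original attempt the inner vector is read as a single step that must satisfy every predicate at once, and even the flattened form [(html/but :nav) :ul :> :li] would still find the nav items, because html, body and div are all non-nav ancestors of that ul. The sketch below instead removes the nav subtree with html/at (a nil transformation drops the matched node) and then runs the plain selector. The names sample-html, sample-dom and li-outside-nav are made up for this example, and the markup is the cleaned-up version from the XPath snippet above.

```clojure
(require '[net.cgrand.enlive-html :as html])

;; Hypothetical sample input: same structure as test.html, with the h1 closed.
(def sample-html
  (str "<html><head><title>Test page</title></head>"
       "<body><div><nav><ul><li>skip these navs-li</li></ul></nav>"
       "<h1>Hello World</h1>"
       "<ul><li>get only these li's</li></ul>"
       "</div></body></html>"))

;; html-snippet parses a string into a seq of Enlive nodes.
(def sample-dom (html/html-snippet sample-html))

(defn li-outside-nav
  "Returns the li nodes that are direct children of a ul and not inside a nav.
  Works by deleting every nav subtree first, then selecting as usual."
  [dom]
  (-> dom
      (html/at [:nav] nil)          ; nil transformation removes the matched nodes
      (html/select [:ul :> :li])))

;; Text of the remaining list items.
(map html/text (li-outside-nav sample-dom))
```

The trade-off is that this walks the tree twice (once to strip nav, once to select), which should be fine for ordinary pages. An alternative with the same effect would be to select all li nodes and remove those also returned by [:nav :li].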