Tags: xml, parsing, clojure

Parsing XML with clojure.data.xml: how to omit "\n " items from :content during parsing?


First I parsed the XML file with clojure.xml:

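(require 'clojure.xml)
(def xtest (slurp "./resources/smallXMLTest.xml"))
;; clojure.xml/parse treats a String argument as a URI, so wrap the XML text in a stream
(def way1 (clojure.xml/parse (java.io.ByteArrayInputStream. (.getBytes xtest "UTF-8"))))
(:content way1)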

The result does NOT have any "\n " items in the :content vector.

But when I parse the same XML with clojure.data.xml:

(require 'clojure.data.xml)
(def way2 (clojure.data.xml/parse-str xtest))
(:content way2)

then the :content of every non-leaf element in way2 contains "\n " strings between each pair of child elements :(

Is there a way to avoid these "\n " strings?


Solution

  • There is an undocumented option, :skip-whitespace, accepted by all the parsers in clojure.data.xml.

    (clojure.data.xml/parse-str whitespacey-str :skip-whitespace true)
    ;; => :content without spurious "\n"s
    

    The documentation bug is tracked at https://clojure.atlassian.net/browse/DXML-63
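To see the difference side by side, here is a minimal sketch that parses the same string with and without the option. It assumes a clojure.data.xml release that supports :skip-whitespace is on the classpath. The sample-xml string and the names with-ws and without-ws are made up for illustration and stand in for the contents of smallXMLTest.xml.

```clojure
(require '[clojure.data.xml :as xml])

;; Hypothetical sample document: indentation between the child
;; elements is what shows up as whitespace-only strings in :content.
(def sample-xml
  "<root>\n  <a>1</a>\n  <b>2</b>\n</root>")

;; Default parse: whitespace-only text nodes are kept.
(def with-ws (xml/parse-str sample-xml))

;; Same input, asking the parser to drop whitespace-only text nodes.
(def without-ws (xml/parse-str sample-xml :skip-whitespace true))

;; Text children sitting directly under <root> in each result.
(println "with whitespace:   " (pr-str (filter string? (:content with-ws))))
(println "skip-whitespace:   " (pr-str (filter string? (:content without-ws))))

;; Child element tags and leaf text that remain with the option on.
(println "remaining children:" (pr-str (map :tag (:content without-ws))))
(println "text inside <a>:   " (pr-str (:content (first (:content without-ws)))))
```

Only text nodes that consist entirely of whitespace should disappear. Text inside leaf elements, such as the "1" inside <a>, is expected to be kept.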