scala, scala-xml

Filtering XML elements with null attributes


I'm trying to extract attributes from a regular XML structure; it seems natural to first exclude the elements for which a particular attribute is missing.

I don't know why the following doesn't work (see answer for why I ever got the idea to test vs. null):

val test = <top><el attr="1"></el><el></el><el attr="2"></el></top>
test.child.filter(_ \ "@attr" != null).map(_ \ "@attr")
// ArrayBuffer(1, NodeSeq(), 2)

Why is the middle element still there after the filter?

I've confirmed it's not operator precedence:

test.child.filter(x => (x \ "@attr") != null).map(_ \ "@attr")
// ArrayBuffer(1, NodeSeq(), 2)

Alternatively (assuming this is optimized internally), how could I exclude the NodeSeq() elements after the map step?


Solution

  • Just figured this out. _ \ "@attr" wasn't returning null for the missing attribute, but an empty NodeSeq(), so the following works:

    test.child.filter(_ \ "@attr" != scala.xml.NodeSeq.Empty).map(_ \ "@attr")
    // ArrayBuffer(1, 2)
    

    I followed this Q&A to discover how to refer to the empty NodeSeq() object by hand (scala.xml.NodeSeq.Empty).


    I discovered my problem ultimately derived from crossing my own wires. I initially had been using the following:

    test.child.map(_.attributes("attr"))
    // ArrayBuffer(1, null, 2)
    

    That is where I originally got the idea to test against null: attributes("attr") returns null when the attribute is missing. Of course, if I had stuck with that accessor, my initial approach would have worked:

    test.child.filter(_.attributes("attr") != null).map(_ \ "@attr")
    // ArrayBuffer(1, 2)
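
As for the second half of the question (dropping the empty results after the map step), here is a minimal sketch of two ways that avoid comparing against null or NodeSeq.Empty altogether. It assumes scala-xml is on the classpath; the object name AttrFilter and the value names afterMap and viaOption are made up for illustration. The first variant filters on nonEmpty after mapping; the second relies on Node.attribute(key), which returns an Option, so flatMap skips elements without the attribute.

```scala
import scala.xml.Elem

object AttrFilter {
  val test: Elem = <top><el attr="1"></el><el></el><el attr="2"></el></top>

  // Variant 1: map first, then drop the empty NodeSeq results.
  val afterMap: List[String] =
    test.child.map(_ \ "@attr").filter(_.nonEmpty).map(_.text).toList

  // Variant 2: attribute(key) returns Option[Seq[Node]], so flatMap
  // silently skips elements that lack the attribute.
  val viaOption: List[String] =
    test.child.flatMap(_.attribute("attr")).map(_.map(_.text).mkString).toList
}
```

Both values should come out as List("1", "2") for the sample document. The Option-based variant is probably the closer fit for "exclude elements missing an attribute", since it never produces a null or an empty placeholder in the first place.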