Counting and filtering Arrow for HXT


I'm trying to parse an XML document, but I want to extract only a fixed number of children from a given node. For example:

<root>
    <node id="a" />
    <node id="b" />
    <node id="c" />
    <node id="d" />
</root>

And then if I execute the arrow getChildren >>> myFilter 2, I would get back only the nodes with id "a" and "b".

My intuition says I should use a state arrow to keep track of the count, but I don't know how to do that.

I tried to do it myself, but the result isn't exactly what I want, doesn't look very elegant, and doesn't work. I run my chain of arrows with runSLA, passing an integer as the initial state, and define:

takeOnly :: IOSLA Int XmlTree XmlTree
takeOnly = changeState (\s b -> s-1)
             >>> accessState (\s b -> if s >= 0 then b else Nothing)

But of course I can't return Nothing; I need to return an XmlTree. And yet I don't want to return anything at all!

There's probably a better way out there. Can you help me?

Thanks for your time and help!


Solution

  • It would probably be more idiomatic to use the combinators in Control.Arrow.ArrowList to handle this kind of thing.

    The package specifically provides (>>.) :: a b c -> ([c] -> [d]) -> a b d, which is a "combinator for converting the result of a list arrow into another list". This allows us to use the take function that we already have for lists in this context.
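
    One detail worth seeing in isolation is that (>>.) transforms the complete result list of the arrow it is attached to, which is why it has to wrap getChildren itself rather than be chained after it. The following is only a small sketch, assuming a recent hxt (9.x) where Text.XML.HXT.Core re-exports the list arrows; it uses the pure LA arrow with runLA and constL on a plain string instead of XML, purely for illustration.

    ```haskell
    module Main where

    import Text.XML.HXT.Core  -- assumed: hxt 9.x, re-exports LA, runLA, constL, this, (>>.)

    main :: IO ()
    main = do
      -- (>>.) receives every result of the arrow on its left as one list,
      -- so `take 2` trims the four results down to the first two.
      print (runLA (constL "abcd" >>. take 2) ())

      -- Here (>>.) is attached to `this`, which is run once per input and
      -- yields a single result each time, so `take 2` has nothing to trim
      -- and all four characters come through.
      print (runLA (constL "abcd" >>> (this >>. take 2)) ())
    ```

    The second expression is the shape that getChildren >>> myFilter 2 would have, which suggests why a stateless myFilter cannot count across siblings on its own.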

    Here's a quick version of how you might use it:

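    module Main where
    
    -- hxt 8.x module; from hxt 9 on the umbrella module is Text.XML.HXT.Core
    import Text.XML.HXT.Arrow
    
    -- Keep only the first n children of the input node.
    takeOnly :: (ArrowXml a) => Int -> a XmlTree XmlTree
    takeOnly n = getChildren >>. take n
    
    main :: IO ()
    main = do
      let xml = "<root><node id='a' /><node id='b' />\
                      \<node id='c' /><node id='d' /></root>"
    
      -- The first getChildren steps from the document root to <root>;
      -- takeOnly then selects that element's first two children.
      print =<< runX (readString [] xml >>> getChildren >>> takeOnly 2)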
    

    This I believe does approximately what you're looking for:

    travis@sidmouth% ./ArrowTake
    [NTree (XTag (LP node) [NTree (XAttr (LP id)) [NTree (XText "a") []]]) [],
     NTree (XTag (LP node) [NTree (XAttr (LP id)) [NTree (XText "b") []]]) []]
    

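    No IOSLA required. Note that I've also changed the function's type a little: it takes the count as an argument, calls getChildren itself, and works in any ArrowXml instance. This version seems nicer to me, but you could easily convert it to something more like the type in your version.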
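
    If you do want something with the shape of your original attempt, where the counter lives in the arrow's state and the filter is chained after getChildren, one possible sketch follows. It again assumes hxt 9.x and uses the pure SLA arrow with runSLA so that it runs without IO. The name takeOnlyS is made up for this example, and constL on a string stands in for getChildren. Instead of returning Nothing, it drops an input by producing no result via isA.

    ```haskell
    module Main where

    import Text.XML.HXT.Core  -- assumed: hxt 9.x, re-exports SLA, runSLA, changeState, accessState, isA

    -- Hypothetical helper: the state is the number of items still allowed through.
    takeOnlyS :: SLA Int b b
    takeOnlyS =
      changeState (\s _ -> s - 1)   -- count this input
        >>> accessState (,)         -- pair the updated counter with the input
        >>> isA ((>= 0) . fst)      -- no result at all once the counter is negative
        >>> arr snd                 -- drop the counter again

    main :: IO ()
    main =
      -- The initial state 2 lets the first two inputs through;
      -- runSLA returns the final state together with the result list.
      print (runSLA (constL "abcd" >>> takeOnlyS) 2 ())
    ```

    Compared with the (>>.) version, this keeps visiting every remaining input just to decrement the counter, so it is mainly useful if the count has to be shared with other parts of a larger stateful arrow.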