Search code examples
haskellhxt

HXT skip element on error


I am iterating over list of tables and inside a proc I am using function which may throw an exception (I can't change its behaviour):

parseThing :: ArrowXml a => a XmlTree Thing
parseThing = deep (hasName "table") &&& deep (hasName "h3") >>> proc (table, h3) -> do
  name <- getText <<< getChildren -< h3
  info -< listA parseTable -< table

  returnA $ mkThing name info -- may throw an exception

where parseTable :: ArrowXml a => a XmlTree (String, String) parses <tr><td>key</td><td>value</td></tr> into a (key, value) tuple and mkThing :: String -> [(String, String)] -> Thing builds Thing by name and info and may throw an exception (it depends on both name and info)

How can I skip processing the thing which cannot be build? If I leave exception uncaught it converts to string somehow and walks into output.


Solution

  • You can use catchA arrow to handle exceptions inside the pipeline. In that case parseThing needs to be of more specific IO-aware type like IOSArrow.

    To skip exceptions, catchA is given none arrow as an exception handler.

    Here is a full example (I extracted parseThing_aux from parseThing for better readability):

    {-# LANGUAGE Arrows #-}
    import Control.Arrow.ArrowExc (catchA)
    import Text.XML.HXT.Core
    
    
    data Thing = Thing String [(String,String)] deriving Show
    
    
    main :: IO ()
    main = do
      xml <- getContents
      res <- runX $ readString [] xml >>> parseThing
      mapM_ print res
    
    
    parseThing :: IOSArrow XmlTree Thing
    parseThing
      = deep (hasName "table") &&& deep (hasName "h3")
      >>> catchA parseThing_aux none
    
    
    parseThing_aux :: IOSArrow (XmlTree, XmlTree) Thing
    parseThing_aux = proc (table, h3) -> do
        name <- getText <<< getChildren -< h3
        info <- listA parseTable -< table
        returnA -< mkThing name info
    
    
    parseTable :: ArrowXml a => a XmlTree (String, String)
    parseTable = error "Not implemented"
    
    
    mkThing :: String -> [(String,String)] -> Thing
    mkThing n i = Thing n i
    

    Side note 1

    In case you would like to zip tables and headers instead of making cross product of them, you can change parseThing to

    parseThing
      = (listA $ deep $ hasName "table") &&& (listA $ deep $ hasName "h3")
      >>> arr2 zip >>> unlistA
      >>> catchA parseThing_aux none
    

    Side note 2

    It is a little bit more concise to write parseThing_aux in point-free style

    parseThing_aux
      = (getChildren >>> getText) *** listA parseTable
      >>^ uncurry mkThing
    

    (You can eliminate uncurry by changing mkThing to take a pair.)