I am iterating over list of tables and inside a proc
I am using function which may throw an exception (I can't change its behaviour):
parseThing :: ArrowXml a => a XmlTree Thing
parseThing = deep (hasName "table") &&& deep (hasName "h3") >>> proc (table, h3) -> do
name <- getText <<< getChildren -< h3
info -< listA parseTable -< table
returnA $ mkThing name info -- may throw an exception
where parseTable :: ArrowXml a => a XmlTree (String, String)
parses <tr><td>key</td><td>value</td></tr>
into a (key, value)
tuple and mkThing :: String -> [(String, String)] -> Thing
builds Thing by name
and info
and may throw an exception (it depends on both name and info)
How can I skip processing the thing which cannot be build? If I leave exception uncaught it converts to string somehow and walks into output.
You can use catchA
arrow to handle exceptions inside the pipeline. In that case parseThing
needs to be of more specific IO-aware type like IOSArrow
.
To skip exceptions, catchA
is given none
arrow as an exception handler.
Here is a full example (I extracted parseThing_aux
from parseThing
for better readability):
{-# LANGUAGE Arrows #-}
import Control.Arrow.ArrowExc (catchA)
import Text.XML.HXT.Core
data Thing = Thing String [(String,String)] deriving Show
main :: IO ()
main = do
xml <- getContents
res <- runX $ readString [] xml >>> parseThing
mapM_ print res
parseThing :: IOSArrow XmlTree Thing
parseThing
= deep (hasName "table") &&& deep (hasName "h3")
>>> catchA parseThing_aux none
parseThing_aux :: IOSArrow (XmlTree, XmlTree) Thing
parseThing_aux = proc (table, h3) -> do
name <- getText <<< getChildren -< h3
info <- listA parseTable -< table
returnA -< mkThing name info
parseTable :: ArrowXml a => a XmlTree (String, String)
parseTable = error "Not implemented"
mkThing :: String -> [(String,String)] -> Thing
mkThing n i = Thing n i
In case you would like to zip tables and headers instead of making cross product of them, you can change parseThing
to
parseThing
= (listA $ deep $ hasName "table") &&& (listA $ deep $ hasName "h3")
>>> arr2 zip >>> unlistA
>>> catchA parseThing_aux none
It is a little bit more concise to write parseThing_aux
in point-free style
parseThing_aux
= (getChildren >>> getText) *** listA parseTable
>>^ uncurry mkThing
(You can eliminate uncurry
by changing mkThing
to take a pair.)