I am learning Haskell arrows by parsing a simple HTML page.
The task is to download the page of a base region, baseRegion = Region "Yekaterinburg" "http://example.com/r/ekb",
parse the links to other regions (via hxt):
regions :: ArrowXml cat => cat a (NTree XNode) -> cat a Region
regions tree =
  tree >>> multi (hasName "a" >>> hasAttrValue "class" (== ".regionlink")) >>>
    proc x -> do
      rname <- getText <<< getChildren -< x
      rurl  <- getAttrValue "href" -< x
      returnA -< Region rname rurl
and append a base region to the result:
allRegions :: ArrowXml cat => cat a (NTree XNode) -> cat a Region
How can I write allRegions? Or, better, where should I dig to write it?
And what if I want not to append to regions's result, but to insert baseRegion at some particular place in the regions list (for example, after the second element, or after an element whose name starts with 'E')?

I think the combinator you are looking for is (>>.) from the ArrowList type class. It lets you apply any list function to the list of results the arrow produces. For example, prepending an element to the results would be:
regions tree >>. (baseRegion:)
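To make the difference between mapping over each result and transforming the whole result list concrete, here is a minimal sketch on hxt's pure list arrow LA. The arrow `numbers` is a made-up stand-in for `regions tree`; it is not part of the question's code.

```haskell
import Text.XML.HXT.Core

-- Hypothetical stand-in for `regions tree`: a list arrow that ignores
-- its input and yields three results.
numbers :: LA () Int
numbers = constL [1, 2, 3]

main :: IO ()
main = do
  -- `arr` is applied to every result separately.
  print (runLA (numbers >>> arr (* 10)) ())
  -- (>>.) hands the complete result list to an ordinary list function,
  -- so elements can be added, removed or reordered.
  print (runLA (numbers >>. (0 :)) ())
  print (runLA (numbers >>. reverse) ())
```

Because the function given to (>>.) has type [c] -> [d], anything you can write as a plain list function can be plugged in.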
As for your second question, you can write a utility function that inserts the region at the correct spot in the list, e.g. something with a signature like
insertRegion :: Region -> [Region] -> [Region]
and then you can use it on the arrow:
regions tree >>. insertRegion baseRegion
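One possible way to write such a helper is sketched below, covering both cases from the question (after the second element, and after an element whose name starts with 'E'). The field names `regionName` and `regionUrl`, the helpers `insertAfterIndex` and `insertAfterFirst`, and the sample regions are assumptions made for illustration; the question only shows the `Region` constructor applied to a name and a URL.

```haskell
-- Assumed shape of Region; the field names are hypothetical.
data Region = Region { regionName :: String, regionUrl :: String }
  deriving (Eq, Show)

-- Insert x after the first n elements (at the end if the list is shorter).
insertAfterIndex :: Int -> a -> [a] -> [a]
insertAfterIndex n x xs = before ++ x : after
  where
    (before, after) = splitAt n xs

-- Insert x right after the first element satisfying p
-- (at the end if no element matches).
insertAfterFirst :: (a -> Bool) -> a -> [a] -> [a]
insertAfterFirst p x xs = case break p xs of
  (before, match : after) -> before ++ match : x : after
  (before, [])            -> before ++ [x]

-- One choice for insertRegion: after the first region whose name starts with 'E'.
insertRegion :: Region -> [Region] -> [Region]
insertRegion = insertAfterFirst (\r -> take 1 (regionName r) == "E")

main :: IO ()
main = do
  let base   = Region "Yekaterinburg" "http://example.com/r/ekb"
      -- Hypothetical sample data.
      others = [ Region "Moscow" "http://example.com/r/msk"
               , Region "Elista" "http://example.com/r/els"
               , Region "Kazan"  "http://example.com/r/kzn" ]
  mapM_ print (insertRegion base others)
  mapM_ print (insertAfterIndex 2 base others)
```

Keeping the insertion logic as a pure list function means it can be tested without any arrow or XML machinery and then attached to the arrow with (>>.).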
By the way, I would personally remove the tree parameter from your regions function, so that regions itself is an arrow from the tree to Region, and just use explicit arrow chaining, so the above becomes:
tree >>> regions >>. insertRegion baseRegion
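A rough end-to-end sketch of this parameterless style follows, under several assumptions: the markup in `sample` is invented, the class is matched as "regionlink" (the question compares against ".regionlink", so adjust the predicate to the real page), `Region` is assumed to take a name and then a URL, and the pure arrow LA with `xread` stands in for downloading and parsing the real site. It uses (&&&) instead of proc notation, so the Arrows extension is not needed here.

```haskell
import Text.XML.HXT.Core

-- Assumed shape of the question's Region type: name, then URL.
data Region = Region String String
  deriving (Eq, Show)

baseRegion :: Region
baseRegion = Region "Yekaterinburg" "http://example.com/r/ekb"

-- regions without the tree parameter: an arrow from a tree to Region.
regions :: ArrowXml cat => cat XmlTree Region
regions =
  multi (hasName "a" >>> hasAttrValue "class" (== "regionlink"))
    >>> (getChildren >>> getText) &&& getAttrValue "href"
    >>> arr (uncurry Region)

-- Prepend the base region to whatever regions finds.
allRegions :: ArrowXml cat => cat XmlTree Region
allRegions = regions >>. (baseRegion :)

-- Hypothetical sample markup with one region link and one unrelated link.
sample :: String
sample =
  "<div><a class=\"regionlink\" href=\"http://example.com/r/msk\">Moscow</a>"
    ++ "<a href=\"/about\">About</a></div>"

main :: IO ()
main = mapM_ print (runLA (xread >>> allRegions) sample)
```

With a list arrow such as LA, the list function given to (>>.) runs once for each tree that reaches allRegions, so this sketch assumes the upstream arrow yields a single root tree.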