With the following code
{-# LANGUAGE Arrows #-}
{-# LANGUAGE NoMonomorphismRestriction #-}
import Text.XML.HXT.Core
parseXml :: IOSArrow XmlTree XmlTree
parseXml = getChildren >>> getChildren >>>
proc x -> do
y <- x >- hasName "item"
returnA -< x
main :: IO ()
main = do
person <- runX (readString [withValidate no]
"<xml><item>John</item><item2>Smith</item2></xml>"
>>> parseXml)
putStrLn $ show person
return ()
I get the output
[NTree (XTag "item" []) [NTree (XText "John") []]]
So it seems that hasName "item"
was applied to x
which I did not expect. Using arrowp I get for parseXml
:
parseXml
= getChildren >>> getChildren >>>
(arr (\ x -> (x, x)) >>>
(first (hasName "item") >>> arr (\ (y, x) -> x)))
So I have the arrow diagram
y
/-- hasName "item" ---
x /
-- getChildren -- getChildren ---\x->(x,x) \(y,x)->x --- final result
\ /
\---------------------/
Why is hasName "item"
also applied to second place of the tuple? I thought there is no state in haskell and hasName "item" x
returns a new object instead of changing the internal state of x
.
Related question: Is factoring an arrow out of arrow do notation a valid transformation?
I have the following code:
{-# LANGUAGE Arrows #-}
import Text.XML.HXT.Core
data Person = Person { forname :: String, surname :: String } deriving (Show)
parseXml :: IOSArrow XmlTree Person
parseXml = proc x -> do
forname <- x >- this /> this /> hasName "fn" /> getText
surname <- x >- this /> this /> hasName "sn" /> getText
returnA -< Person forname surname
main :: IO ()
main = do
person <- runX (readString [withValidate no]
"<p><fn>John</fn><sn>Smith</sn></p>"
>>> parseXml)
putStrLn $ show person
return ()
If I run it everything works fine and I get the output
[Person {forname = "John", surname = "Smith"}]
But if I change the parseXml
to avoid the this
statements
parseXml :: IOSArrow XmlTree Person
parseXml = (getChildren >>> getChildren) >>> proc x -> do
forname <- x >- hasName "fn" /> getText
surname <- x >- hasName "sn" /> getText
returnA -< Person forname surname
no person can be parsed anymore (output is []
). Investigating the problem with
parseXml :: IOSArrow XmlTree Person
parseXml = (getChildren >>> getChildren) >>>
proc x -> do
forname <- x >- withTraceLevel 5 traceTree >>> hasName "fn" /> getText
surname <- x >- hasName "sn" /> getText
returnA -< Person forname surname
I got the output
content of:
============
---XTag "fn"
|
+---XText "John"
content of:
============
---XTag "sn"
|
+---XText "Smith"
[]
So everything seems fine, but with the code
parseXml :: IOSArrow XmlTree Person
parseXml = (getChildren >>> getChildren) >>>
proc x -> do
forname <- x >- hasName "fn" /> getText
surname <- x >- withTraceLevel 5 traceTree >>> hasName "sn" /> getText
returnA -< Person forname surname
I got
content of:
============
---XTag "fn"
|
+---XText "John"
[]
So it seems to me, that the value of the input x
changes between the two statements. It looks like the hasName "fn"
was applied to x
before it was attached to the surname
arrow. But shall x
not remain the same between the two lines?
No, the input can't change and it doesn't.
What you've programmed in the lines
proc x -> do
y <- x >- hasName "item"
returnA -< x
is just a filter removing all nodes not named item
.
His is equivalent to the arrow
hasName "item" `guards` this
You can test this with
{-# LANGUAGE Arrows #-}
{-# LANGUAGE NoMonomorphismRestriction #-}
module Main where
import Text.XML.HXT.Core
parseXml0 :: IOSArrow XmlTree XmlTree
parseXml0 = getChildren >>> getChildren >>>
proc x -> do
_ <- hasName "item" -< x
returnA -< x
parseXml1 :: IOSArrow XmlTree XmlTree
parseXml1 = getChildren >>> getChildren >>>
(hasName "item" `guards` this)
main1 :: Show c => IOSArrow XmlTree c -> IO ()
main1 parseXml = do
person <- runX (readString [withValidate no]
"<xml><item>John</item><item2>Smith</item2></xml>"
>>> parseXml)
putStrLn $ show person
return ()
main :: IO ()
main = main1 parseXml0 >> main1 parseXml1