xml haskell hxt io-monad

Parse other XML files whilst parsing tree with HXT


I am parsing XML files that describe the user interface of a game, and I am trying to learn HXT at the same time. I can successfully parse a single XML file, but I could not figure out the best way to open and parse other XML files from inside the getWindow function.

Each XML file consists of a number of Window elements. Each Window has a name and a libraryItemName; the latter is the name of the XML file that describes that window. For instance, the root looks like this:

<!-- DOMDocument.xml -->
<elements>
    <Window libraryItemName="window_home" name="window_home">
    <!-- data here  -->
    </Window>
    <Window libraryItemName="window_battle" name="window_battle">
    <!-- data here  -->
    </Window>
</elements>

There is also a separate XML file for each window, e.g. "window_home":

<!-- window_home.xml -->
<elements>
    <Window libraryItemName="panel_tabs" name="panel_tabs" selected="true">
    <!-- data here  -->
    </Window>
    <Window libraryItemName="home_powerup_menu" name="home_powerup_menu" selected="true">
    <!-- data here  -->
    </Window>
    <Window libraryItemName="panel_name" name="panel_name" selected="true">
    <!-- data here  -->
    </Window>
</elements>

I parse the root DOMDocument.xml with this code:

{-# LANGUAGE Arrows, NoMonomorphismRestriction #-}
import Text.XML.HXT.Core

parseXML = readDocument [ withValidate no
                        , withRemoveWS yes  -- throw away formatting whitespace
                        ]

atTag tag = deep (isElem >>> hasName tag)

data UiWindow = UiWindow {
    wndName :: String,
    wndNameLib :: String,
    wndChildren :: [UiWindow]
    } deriving (Show)

initUiWindow = UiWindow {
    wndName = "empty",
    wndNameLib = "",
    wndChildren = []
    }


getWindow = atTag "Window" >>> 
  proc x -> do
    _libraryItemName <- getAttrValue "libraryItemName" -< x
    _name <- getAttrValue "name" -< x
    -- TODO: Open _libraryItemName XML file and parse windows in it
    returnA -< initUiWindow { wndName = _name, wndNameLib = _libraryItemName}

documentName = "DOMDocument.xml"        

parseRoot = parseXML documentName
--runX (parseRoot >>> getWindow )

Since getWindow does not run inside IO, what would be the best way to achieve the desired behaviour?


Solution

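  • The HXT combinators are polymorphic in the arrow type. Besides the pure arrows, HXT provides IO-capable arrow types such as IOLA, which implement all the type classes relevant to XML parsing and, in addition, ArrowIO, which makes it possible to do IO mid-arrow. (readDocument and runX work in the related state-carrying type IOSLA, which is an ArrowIO instance as well.)

    For example, if you want fully recursive parsing of the files, you can do something as simple as: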

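    -- Parse <docName>.xml and return every Window found in it.
    parseDoc :: String -> IO [UiWindow]
    parseDoc docName = runX $ parseXML fileName >>> getWindow
      where
        fileName = docName ++ ".xml"

    getWindow :: IOSArrow XmlTree UiWindow
    getWindow = atTag "Window" >>> proc x -> do
        libraryItemName <- getAttrValue "libraryItemName" -< x
        name <- getAttrValue "name" -< x
        -- arrIO lifts the IO action parseDoc into the arrow, so the file
        -- named by libraryItemName is opened and parsed at this point.
        children <- arrIO parseDoc -< libraryItemName
        returnA -< initUiWindow { wndName = name, wndNameLib = libraryItemName, wndChildren = children }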
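
To see why an IO-capable arrow makes this recursion possible, here is a minimal sketch that depends only on base, not on HXT. It assumes a toy arrow type IOList (a function that performs IO and yields a list of results), which only models the idea behind HXT's IO arrows and is not their actual implementation. The names IOList, liftIOList, Node, readNames and loadNodes are all hypothetical, and readNames stands in for reading the Window names out of a file.

```haskell
import Prelude hiding (id, (.))
import Control.Category (Category (..))
import Control.Arrow (Arrow (..), returnA, (>>>))

-- Toy model (an assumption, not HXT's code): an arrow is a function
-- that may perform IO and returns zero or more results.
newtype IOList a b = IOList { runIOList :: a -> IO [b] }

instance Category IOList where
  id = IOList (\x -> return [x])
  -- Run f, then run g on every result of f and collect all outputs.
  IOList g . IOList f = IOList $ \x -> do
    ys <- f x
    concat <$> mapM g ys

instance Arrow IOList where
  arr f = IOList (\x -> return [f x])
  first (IOList f) = IOList $ \(x, z) -> do
    ys <- f x
    return [(y, z) | y <- ys]

-- Lift an IO action into the arrow; this plays the role of arrIO.
liftIOList :: (a -> IO b) -> IOList a b
liftIOList f = IOList (\x -> fmap (: []) (f x))

data Node = Node { nodeName :: String, nodeChildren :: [Node] }
  deriving (Show, Eq)

-- Hypothetical stand-in for "read the Window names from <doc>.xml".
readNames :: String -> IO [String]
readNames "root" = return ["home", "battle"]
readNames "home" = return ["tabs"]
readNames _      = return []

-- Recursive loading: for every name found in a document, run the
-- loader again (in IO, mid-arrow) to obtain that node's children.
loadNodes :: String -> IO [Node]
loadNodes = runIOList (names >>> toNode)
  where
    names  = IOList readNames
    toNode = (returnA &&& liftIOList loadNodes) >>> arr (uncurry Node)
```

With these stand-in files, loadNodes "root" yields two nodes: home, which has the single child tabs, and battle, which has no children. The recursion ends as soon as a document lists no further names.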