XInclude and Scales XML


I'm using the Scales library to process XML in Scala, and I'd like to have xi:include elements expanded during parsing. This doesn't happen by default. For example, if I write the following, any xi:include elements in the XML file appear unexpanded in the parsed document:

import java.io.FileReader
import scales.utils._, ScalesUtils._
import scales.xml._, ScalesXml._

val doc = loadXml(new FileReader("data/example.xml"))

Is it possible to have these elements expanded? I'm using Scales 0.6.0-M1.


Solution

  • Yes! It's at least possible. You can write your own SAXParserFactory pool on the model of the provided DefaultSAXParserFactoryPool:

    import javax.xml.parsers.{ SAXParser, SAXParserFactory }
    import scales.utils._, ScalesUtils._
    import scales.xml._, ScalesXml._
    import scales.xml.parser.sax.DefaultSaxSupport
    
    object XIncludeSAXParserFactoryPool extends
      resources.SimpleUnboundedPool[SAXParserFactory] { pool =>
    
      // Called by the pool whenever it needs a new factory.
      def create = {
        val parserFactory = SAXParserFactory.newInstance()
        parserFactory.setNamespaceAware(true)
        parserFactory.setFeature("http://xml.org/sax/features/namespaces", true)
        // The one setting that matters here: turn on XInclude processing.
        parserFactory.setXIncludeAware(true)
        parserFactory.setValidating(false)
        parserFactory
      }
    
      // Loans out a fresh SAXParser built from a pooled factory.
      val parsers = new resources.Loaner[SAXParser] with DefaultSaxSupport {
        def loan[X](tThunk: SAXParser => X): X =
          pool.loan(x => tThunk(x.newSAXParser))
      }
    }
    

    Then you can specify the parser pool in your loadXml call:

    val doc = loadXml(
      source = new FileReader("data/example.xml"),
      parsers = XIncludeSAXParserFactoryPool.parsers
    )
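
    To see what setXIncludeAware changes independently of Scales, here is a minimal sketch that uses only the JDK's own SAX parser. The names XIncludeDemo, elementNames, and demo, and the temporary files main.xml and part.xml, are made up for illustration. It parses the same file twice and records the element names the parser reports, so on a stock JDK the first pass should still show the include element and the second should show the included element in its place.

    ```scala
    import java.io.File
    import java.nio.file.Files
    import javax.xml.parsers.SAXParserFactory
    import org.xml.sax.Attributes
    import org.xml.sax.helpers.DefaultHandler
    import scala.collection.mutable.ListBuffer

    // Hypothetical demo object; not part of Scales.
    object XIncludeDemo {

      // Parses `file` and returns the local names of all elements reported,
      // with XInclude processing switched on or off.
      def elementNames(file: File, xinclude: Boolean): List[String] = {
        val factory = SAXParserFactory.newInstance()
        factory.setNamespaceAware(true) // XInclude needs namespace awareness
        factory.setXIncludeAware(xinclude)

        val names = ListBuffer.empty[String]
        val handler = new DefaultHandler {
          override def startElement(uri: String, localName: String,
                                    qName: String, attrs: Attributes): Unit =
            names += localName
        }

        // Parsing from a File lets the parser work out the system ID itself,
        // so the relative href below resolves next to the main document.
        factory.newSAXParser().parse(file, handler)
        names.toList
      }

      // Writes a two-file example to a temp directory and parses it both ways.
      def demo(): (List[String], List[String]) = {
        val dir  = Files.createTempDirectory("xinclude-demo")
        val part = dir.resolve("part.xml")
        val main = dir.resolve("main.xml")
        Files.write(part, "<part/>".getBytes("UTF-8"))
        Files.write(main,
          ("""<root xmlns:xi="http://www.w3.org/2001/XInclude">""" +
           """<xi:include href="part.xml"/></root>""").getBytes("UTF-8"))

        (elementNames(main.toFile, xinclude = false),
         elementNames(main.toFile, xinclude = true))
      }
    }
    ```

    The custom pool above applies the same factory setting; the only difference is that Scales, rather than a hand-written handler, consumes the SAX events.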
    

    Note, though, that if your include hrefs contain relative URIs and you want them resolved relative to the location of the document (rather than the current working directory), the InputSource must carry the document's system ID. Here's one way to do this:

    import java.io.File
    import org.xml.sax.InputSource
    
    loadXml(
      source = new InputSource(new File("data/example.xml").toURI.toString),
      parsers = XIncludeSAXParserFactoryPool.parsers
    )
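
    If you'd rather keep reading through a FileReader, as in the question, another option is to attach the system ID to the InputSource yourself. This is only a sketch: SystemIdSource and forFile are hypothetical helper names, and it relies on nothing beyond the standard org.xml.sax.InputSource API.

    ```scala
    import java.io.{ File, FileReader }
    import org.xml.sax.InputSource

    // Hypothetical helper; not part of Scales.
    object SystemIdSource {

      // Builds an InputSource that reads characters from `file` and also
      // carries the file's URI as its system ID, which gives the parser a
      // base against which to resolve relative hrefs.
      def forFile(file: File): InputSource = {
        val source = new InputSource(new FileReader(file))
        source.setSystemId(file.toURI.toString)
        source
      }
    }
    ```

    The result can be passed as the source argument in the loadXml call above, for example source = SystemIdSource.forFile(new File("data/example.xml")). One trade-off: a FileReader decodes with the platform default charset rather than the encoding declared in the XML, so the URI-only InputSource is the safer choice when the encoding matters.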
    

    All of the above should work in 0.5.0 as well as 0.6.0.

    If there are better ways to solve this problem, I'd love to hear them.