I'm using the Scales library to process XML in Scala, and I'd like to have xi:include
elements expanded during parsing. This doesn't happen by default—if I write the following, for example, any include
elements in the XML file will appear unexpanded in the parsed document:
import java.io.FileReader
import scales.utils._, ScalesUtils._
import scales.xml._, ScalesXml._
val doc = loadXml(new FileReader("data/example.xml"))
Is it possible to have these elements expanded? I'm using Scales 0.6.0-M1.
Yes! It's at least possible. You can write your own SAXParserFactory pool on the model of the provided DefaultSAXParserFactoryPool:
import javax.xml.parsers.{ SAXParser, SAXParserFactory }
import scales.utils._, ScalesUtils._
import scales.xml._, ScalesXml._
import scales.xml.parser.sax.DefaultSaxSupport
object XIncludeSAXParserFactoryPool extends
    resources.SimpleUnboundedPool[SAXParserFactory] { pool =>
  def create = {
    val parserFactory = SAXParserFactory.newInstance()
    parserFactory.setNamespaceAware(true)
    parserFactory.setFeature("http://xml.org/sax/features/namespaces", true)
    // The one addition relative to the default pool: expand xi:include elements.
    parserFactory.setXIncludeAware(true)
    parserFactory.setValidating(false)
    parserFactory
  }
  // Loans out a fresh SAXParser created from a pooled factory.
  val parsers = new resources.Loaner[SAXParser] with DefaultSaxSupport {
    def loan[X](tThunk: SAXParser => X): X =
      pool.loan(x => tThunk(x.newSAXParser))
  }
}
Then you can specify the parser pool in your loadXml call:
val doc = loadXml(
  source = new FileReader("data/example.xml"),
  parsers = XIncludeSAXParserFactoryPool.parsers
)
Note, though, that if you have relative URIs in your include hrefs and you want them resolved relative to the location of the document (rather than the current directory), you'll need to be sure that the InputSource gets the system ID. Here's one way to do this:
import java.io.File
import org.xml.sax.InputSource
loadXml(
  source = new InputSource(new File("data/example.xml").toURI.toString),
  parsers = XIncludeSAXParserFactoryPool.parsers
)
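If you want to check the system ID behaviour without Scales in the picture, here is a minimal sketch using only the JDK's JAXP parser. The object name XIncludeSystemIdDemo, the helpers writeDemoFiles and elementNames, and the two temporary files are made up for illustration; the sketch also assumes the JDK's bundled parser implementation supports XInclude. It writes a document whose include href is relative, parses it through an InputSource built from a system ID, and collects the element names the SAX handler sees.

```scala
import java.nio.file.{ Files, Path }
import javax.xml.parsers.SAXParserFactory
import org.xml.sax.{ Attributes, InputSource }
import org.xml.sax.helpers.DefaultHandler
import scala.collection.mutable.ListBuffer

// Hypothetical demo object; none of these names come from Scales.
object XIncludeSystemIdDemo {

  // Writes main.xml and part.xml side by side in a fresh temporary
  // directory and returns the path of main.xml. The include href is
  // relative, so it can only be resolved against main.xml's location.
  def writeDemoFiles(): Path = {
    val dir = Files.createTempDirectory("xinclude-demo")
    Files.write(dir.resolve("part.xml"), "<part/>".getBytes("UTF-8"))
    val main = dir.resolve("main.xml")
    val xml =
      """<root xmlns:xi="http://www.w3.org/2001/XInclude">""" +
      """<xi:include href="part.xml"/></root>"""
    Files.write(main, xml.getBytes("UTF-8"))
    main
  }

  // Parses the document at the given system ID with an XInclude-aware
  // SAX parser and returns the local names of the elements reported,
  // in document order.
  def elementNames(systemId: String): List[String] = {
    val factory = SAXParserFactory.newInstance()
    factory.setNamespaceAware(true)
    factory.setXIncludeAware(true)

    val names = ListBuffer.empty[String]
    val handler = new DefaultHandler {
      override def startElement(
        uri: String, localName: String, qName: String, attrs: Attributes
      ): Unit = names += localName
    }
    // Building the InputSource from a system ID gives the parser a
    // base URI against which the relative href is resolved.
    factory.newSAXParser().parse(new InputSource(systemId), handler)
    names.toList
  }

  def main(args: Array[String]): Unit = {
    val main = writeDemoFiles()
    println(elementNames(main.toUri.toString))
  }
}
```

If the include is expanded, the list should contain the element from part.xml in place of the xi:include element. Passing the same file as a bare Reader or InputStream instead leaves the parser without a base URI for the relative href, which is the situation described above.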
All of the above should work in 0.5.0 as well as 0.6.0.
If there are better ways to solve this problem, I'd love to hear them.