
eXist-db and XQuery: XIncludes or collections (TEI-XML)?


I have a corpus in TEI-XML that uses a 'master' corpus XML document which, via xi:include, pulls in thousands of other documents. Each of these documents in turn contains xi:includes pointing to master lists of named entities (people, places, etc., linked by xml:ids). All of this works very well in XSLT (and in my IDE, Oxygen, for fast encoding).

I am now embarking on building a website using eXist-db applications. I am rewriting everything directly in XQuery (to replace XSLT), and I have hit upon an unexpected decision. I am used to using xi:includes to traverse the corpus and its various XML files. But reading the eXist-db documentation, it seems that the encouraged practice is to use collections and query them directly, instead of navigating via xi:includes. It also seems that eXist-db does not fully implement XInclude anyway and requires some workarounds?

I am looking for guidance on eXist-db/XQuery best practices in this context.

Many thanks in advance.


Solution

  • Correct, eXist's XInclude implementation is focused on output (i.e., serialization) rather than on querying or indexing. As eXist's documentation page on XInclude states:

    The XInclude processor is implemented as a filter in between the serializer's output event stream and the receiver... XInclude processing is therefore applied whenever eXist-db serializes an XML fragment, whether it's a document, the result of an XQuery or an XSLT stylesheet.

    Thus, if you use XInclude to assemble your corpus and you want to query/traverse this corpus, you could do so by (1) writing a query that reads your xi:include elements and follows them like a map to find the component documents, (2) pre-serializing your data into a new document and then querying the resulting document directly, or (3) placing the documents into collections that facilitate the kinds of queries you want to do.
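
As a rough illustration of options (1) and (3), here is a minimal XQuery sketch. Everything specific in it is an assumption rather than something taken from the question: the collection path /db/apps/my-corpus/data, the master file name corpus.xml, the entity ID p001, and the premise that each xi:include carries a plain relative file name in @href (no ../ segments, no xpointer). It relies on eXist-db's util:collection-name() to turn those relative references into database paths.

```xquery
xquery version "3.1";

import module namespace util = "http://exist-db.org/xquery/util";

declare namespace tei = "http://www.tei-c.org/ns/1.0";
declare namespace xi  = "http://www.w3.org/2001/XInclude";

(: Hypothetical locations: adjust to wherever the corpus is stored. :)
declare variable $data-root := "/db/apps/my-corpus/data";
declare variable $master-path := $data-root || "/corpus.xml";

(: Option 1: read the master document and follow each xi:include by hand.
   Assumes @href is a simple file name relative to the including document. :)
declare function local:included-docs($master as document-node()?) as document-node()* {
    for $include in $master//xi:include[@href]
    let $path := util:collection-name($include) || "/" || $include/@href
    return doc($path)
};

(: Option 3: ignore the xi:includes and query the collection directly.
   collection() also covers documents in sub-collections. :)
declare function local:corpus-docs() as document-node()* {
    collection($data-root)[tei:TEI]
};

(: Example: look up one entity by xml:id anywhere in the collection,
   and count documents reached by each route. The ID is hypothetical. :)
let $person-id := "p001"
return
    <result>
        <via-includes>{ count(local:included-docs(doc($master-path))) }</via-includes>
        <via-collection>{ count(local:corpus-docs()) }</via-collection>
        <entity>{ collection($data-root)//tei:person[@xml:id = $person-id] }</entity>
    </result>
```

With the collection approach, the entity lists no longer need to be included into every document: a query can join a reference in one document to the matching xml:id in another, and eXist-db's indexes can be configured per collection to support that kind of lookup.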