exist-db: XQuery and documents with XInclude


I'm embarking on a new project with eXist. We'll be storing a few hundred TEI XML documents that represent manuscripts. A number of the things we want to capture are repetitive, mainly people and places. My colleague asked the TEI community about strategies for representing what we want to capture, and XInclude was suggested as a way of reducing duplication.

I've had a quick play with adding an XInclude to a document, and the serialized XML does render the included XML file. However, the included text was missing from the results of an XQuery. I notice that the eXist docs (http://exist-db.org/exist/apps/doc/xinclude.xml) say:

eXist-db expands XIncludes at serialization time, which means that the query engine will see the XInclude tags before they are expanded. You therefore cannot query across XIncludes - unless you create your own code (e.g. an XQuery function) for it. We would certainly like to support queries over xincluded content in the future though.

What is the best practice for querying files that use XInclude?

I'm wondering whether I should have a 'job' that serializes the source TEI XML files to expand the XIncludes and stores the results in a separate collection. In that case, would file:serialize be the correct function for this task?

We are at the start of the project, so any advice appreciated.


Solution

  • Can you describe what kind of query you tried that was missing the text?

    Generally, since the files referenced via XInclude are themselves well-formed XML documents, you can use collections (folders) to organise your queries in eXist-db. So instead of for $search in doc("mydoc.xml") you could write for $search in collection('/app/mydata')/* and query the manuscripts and the included files together.
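
    As a rough sketch of the collection approach: the query below looks for person names across every document in one collection, so it finds them in the shared files regardless of which manuscript includes them. The collection path '/db/apps/mydata' and the TEI element names are assumptions for illustration, not taken from your data.

    ```xquery
    xquery version "3.1";

    declare namespace tei = "http://www.tei-c.org/ns/1.0";

    (: '/db/apps/mydata' is a hypothetical collection that holds both the
       manuscripts and the shared person/place files. :)
    for $name in collection('/db/apps/mydata')//tei:person/tei:persName
    return $name
    ```

    Because this runs against stored documents, any indexes configured for the collection can still be used.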

    A more elaborate answer would follow the href attribute of the unexpanded xi:include element in the source document and find the matching element in the target, but it's difficult to abstract that without a concrete minimal working example (MWE).
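
    Without an MWE, the following is only a minimal sketch of resolving the includes by hand. It assumes every @href is a plain file name relative to a single collection and that no @xpointer is used. The variable $data-root, the function local:included and the file name manuscript1.xml are all hypothetical.

    ```xquery
    xquery version "3.1";

    declare namespace xi = "http://www.w3.org/2001/XInclude";

    (: Hypothetical collection path; adjust to your own layout. :)
    declare variable $data-root := '/db/apps/mydata';

    (: Return the root element of every document that $doc includes.
       Assumes @href is a file name inside $data-root and ignores @xpointer. :)
    declare function local:included($doc as document-node()) as element()* {
        for $href in distinct-values($doc//xi:include/@href)
        return doc($data-root || '/' || $href)/*
    };

    local:included(doc($data-root || '/manuscript1.xml'))
    ```

    The returned elements are still stored nodes, so you can continue the path expression on them as you would on any other document in the database.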

    Have you tried creating a temporary, expanded fragment in a let clause and querying that instead of the stored XML? Beware of namespaces!
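
    One way to get such an expanded fragment, as far as I know, is util:expand, which creates an in-memory copy of a node using serialization options. Treat this as a sketch under that assumption: the document path is hypothetical, and the TEI namespace has to be declared or the path expression will match nothing.

    ```xquery
    xquery version "3.1";

    import module namespace util = "http://exist-db.org/xquery/util";

    declare namespace tei = "http://www.tei-c.org/ns/1.0";

    (: util:expand returns an in-memory copy of the node; the
       'expand-xincludes=yes' option asks for xi:include elements
       to be resolved in that copy. The path below is hypothetical. :)
    let $expanded := util:expand(
        doc('/db/apps/mydata/manuscript1.xml')/*,
        'expand-xincludes=yes'
    )
    return $expanded//tei:persName
    ```

    The copy lives in memory, so database indexes do not apply to it and this will be slower over a few hundred documents. If you want a persistent, indexed copy, the expanded node could instead be written to a second collection with xmldb:store.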

    Hope this helps, and greetings to Sebastiaan.