
Can we replace the default document URI with a value from the document itself during mlcp ingestion in MarkLogic?


I want to replace the default document URI of the file with a value taken from the file's content.

For example, the default URI is /test/Invoice.xml

I want to change the document URI to

/Invoice_{current date time from file from field DateCreated}.xml

The file looks like this

<?xml version="1.0" encoding="UTF-8"?>
<Test xsi:noNamespaceSchemaLocation="file:///D:/Mapforce/Projects/schema/Test.xsd" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
    <ID>f1258d4ae0df43d5a1e05ce9139f0ed2</ID>
    <SystemRef>22000041</SystemRef>
    <DateCreated>2022-09-06T19:07:46.3492849+01:00</DateCreated>
    <TimeSaved>240</TimeSaved>
    <ManyReasons/>
    <SubmissionUser>System</SubmissionUser>
    <InternalBusinessUnit>Finance</InternalBusinessUnit>
    <Direction>Inbound</Direction>
</Test>

How can I do this using mlcp?


Solution

  • Controlling Database URIs During Ingestion

    By default, the document URIs created by mlcp during ingestion are determined by the input source. The tool supports several command line options for modifying this default behavior.

    • Transforming the Default URI

      Use the following options to tailor the database URI of inserted documents:

      • -output_uri_replace performs one or more string substitutions on the default URI.
      • -output_uri_prefix prepends a string to the URI after substitution.
      • -output_uri_suffix appends a string to the URI after substitution.

      The -output_uri_replace option accepts a comma delimited list of regular expression and replacement string pairs.

    These options only rewrite the default URI string; they cannot read values from the document's content. If you are applying a custom transformation, however, you can also control the URI of the document. Inside the transform function, set the "uri" entry of the $content map to whatever value you want, e.g. map:put($content, "uri", "myCustomURI.xml"). See: Example: Changing the URI and Document Type

    So, in your custom transform you could use XPath to select the DateCreated element and bind it to a variable:

    let $created := map:get($content, "value")/Test/DateCreated
    

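    and then use it to construct the desired URI, setting it under the "uri" key of the $content map (you may want to normalize/format the DateCreated value first, since the raw timestamp contains characters such as ":" and "+" that make for an awkward URI):

    map:put($content, "uri", "/Invoice_"||$created||".xml")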
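
    Putting those pieces together, a minimal sketch of a complete transform module might look like the following. The module namespace URI and the timestamp format are assumptions chosen for illustration, not something the question or the mlcp documentation prescribes; the function signature follows the one mlcp expects for XQuery transforms. The sketch keeps only the first 19 characters of DateCreated (date and time to the second) and strips the "-" and ":" separators.

```xquery
xquery version "1.0-ml";

(: Hypothetical namespace; it only has to match the value passed
   to mlcp via -transform_namespace. :)
module namespace example = "http://example.com/mlcp/invoice-uri";

declare function example:transform(
  $content as map:map,
  $context as map:map
) as map:map*
{
  (: The document being ingested is stored under the "value" key. :)
  let $doc := map:get($content, "value")

  (: String value of the first DateCreated element, or "" if absent. :)
  let $created := fn:string(($doc/Test/DateCreated)[1])

  (: Assumed normalization: "2022-09-06T19:07:46.3492849+01:00"
     becomes "20220906T190746". :)
  let $stamp := fn:replace(fn:substring($created, 1, 19), "[-:]", "")

  return (
    (: Override the URI only when a timestamp was found;
       otherwise mlcp's default URI is left untouched. :)
    if ($stamp ne "")
    then map:put($content, "uri", "/Invoice_" || $stamp || ".xml")
    else (),
    $content
  )
};
```

    To use it, the module would be installed in the modules database of the App Server mlcp connects to, and referenced on the import command line with -transform_module (the module's URI) and -transform_namespace (the namespace above). Note that dropping the fractional seconds means two documents created within the same second would map to the same URI, so keep more of the timestamp, or append the ID element, if that can happen in your data.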