marklogic, marklogic-8

Using xdmp:document-insert and specifying the XML encoding


I have an XQuery endpoint that loads an incoming file into the database using xdmp:document-insert. It fails when I try to upload an XML document that has "ISO-8859-1" encoding. Here is my code:

declare %rapi:transaction-mode("update") function repoLoad:post(
    $context as map:map,
    $params as map:map,
    $input as document-node()*
) as document-node()*
{

    let $filename := xdmp:get-request-field-filename("upload")
    let $contentType := xdmp:get-request-field-content-type("upload")

    let $uri := "/documents/"

    let $_ := xdmp:document-insert($uri, xdmp:get-request-field("upload"),(xdmp:default-permissions()), ("raw"))

    (: wrap the URI in a document node to match the declared return type :)
    return document { $uri }

};

This fails for XML documents that are not UTF-8 encoded. I get the following exception. Is there a workaround?

Error: AppRequestTask::run: XDMP-DOCUTF8SEQ: Invalid UTF-8 escape sequence at line 1 -- document is not UTF-8 encoded


Solution

  • If you can generate the payload with an XML prolog that declares the encoding, that should work:

    <?xml version="1.0" encoding="ISO-8859-1"?>
    ... rest of the document ...
    

    Otherwise, you could try something like the following on the server to generate the XML document for insertion:

    xdmp:document-insert(
        $uri,
        xdmp:binary-decode(
            xdmp:unquote(
                xdmp:get-request-field("upload"), (), "format-binary"
                ),
            "ISO-8859-1"
            ),
        ... permissions, collections, and other arguments ...
        )
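
    Note that xdmp:binary-decode returns a string rather than a node, so another variant is to decode the raw bytes first and then parse the result with xdmp:unquote. The following is only a sketch under assumptions not confirmed in this thread: that xdmp:get-request-field hands back a binary node for this upload, and that every incoming file really is ISO-8859-1. The variable names and the example URI are hypothetical.

    ```xquery
    xquery version "1.0-ml";

    (: Hypothetical URI, for illustration only. :)
    let $uri := "/documents/example.xml"
    let $raw := xdmp:get-request-field("upload")
    (: Assumption: non-UTF-8 uploads arrive as a binary node; decode those explicitly. :)
    let $text :=
        if ($raw instance of binary())
        then xdmp:binary-decode($raw, "ISO-8859-1")
        else xs:string($raw)
    (: Parse the decoded string into an XML document node. :)
    let $doc := xdmp:unquote($text)
    return xdmp:document-insert($uri, $doc, xdmp:default-permissions(), "raw")
    ```

    This sketch hard-codes the encoding. If it varies per upload, it would have to be passed in as a parameter or detected separately.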
    

    Hoping that helps,