marklogic, mlcp

Error parsing HTTP headers exception while copying data using MLCP


I am trying to copy a large set of data from one database to another using MLCP, but I am getting the following exception.

2019-08-30 11:53:54.847 SEVERE [15] (StreamingResultSequence.next): IOException instantiating ResultItem 130891: Error parsing HTTP headers: Premature EOF, partial header line read: 'X-URI: /integration/test%2BItem%2BBarcode%2BCross%2BR'
 java.io.IOException: Error parsing HTTP headers: Premature EOF, partial header line read: 'X-URI: /integration/test%2BItem%2BBarcode%2BCross%2BR'
        com.marklogic.http.HttpHeaders.parsePlainHeaders(HttpHeaders.java:317)
        com.marklogic.http.MultipartBuffer.next(MultipartBuffer.java:103)
        com.marklogic.xcc.impl.AbstractResultSequence.instantiateResultItem(AbstractResultSequence.java:132)
        com.marklogic.xcc.impl.StreamingResultSequence.next(StreamingResultSequence.java:147)
        com.marklogic.xcc.impl.StreamingResultSequence.next(StreamingResultSequence.java:166)
        com.marklogic.contentpump.DatabaseContentReader.nextKeyValue(DatabaseContentReader.java:518)
        com.marklogic.contentpump.LocalJobRunner$TrackingRecordReader.nextKeyValue(LocalJobRunner.java:498)
        org.apache.hadoop.mapreduce.task.MapContextImpl.nextKeyValue(MapContextImpl.java:80)
        org.apache.hadoop.mapreduce.lib.map.WrappedMapper$Context.nextKeyValue(WrappedMapper.java:91)
        com.marklogic.contentpump.BaseMapper.run(BaseMapper.java:78)
        com.marklogic.contentpump.LocalJobRunner$LocalMapTask.call(LocalJobRunner.java:411)
        java.util.concurrent.FutureTask.run(Unknown Source)
        java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
        java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
        java.lang.Thread.run(Unknown Source)

Please help me understand and fix the issue.

The MLCP command that I am using:

mlcp copy -mode local -input_host 192.168.1.46 -input_port 9000 -input_username admin -input_password admin -input_database test  -output_host localhost -output_port 8000 -output_username admin -output_password admin -output_database test

Solution

  • It sounds like you may be hitting a timeout condition. There is a MarkLogic knowledgebase article describing common scenarios in which this can occur, along with a few things to try to avoid them.

    https://help.marklogic.com/Knowledgebase/Article/View/75/0/explaining-and-preventing-premature-eof-errors-when-loading-content-via-xcc

    The premature EOF exception generally occurs when the connection to a particular application server was lost while the XCC driver was in the process of reading a result set. This can happen in a few scenarios:

    • The host became unavailable due to a hardware issue, segfault or similar issue;
    • The query timeout expired (although this is much more likely to yield an XDMP-EXTIME exception with a "Time limit exceeded" message);
    • Network interruption - a possible indicator of a network reliability problem such as a misconfigured load balancer or a fault in some other network hardware.

    Configuration / Code: things to try when you first see this message

    • A possible cause of errors like this is the JVM starting garbage collection, with the pause lasting long enough to exceed the server timeout setting. If this is the case, try adding the -XX:+UseConcMarkSweepGC Java option.
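
      As a sketch of one way to pass that option through to the MLCP process: recent mlcp.sh launcher scripts read a JVM_OPTS environment variable for extra JVM flags, but check the script shipped with your MLCP version before relying on it (the variable name is an assumption here, not something confirmed for your install).

        # Sketch only: assumes mlcp.sh honours JVM_OPTS; verify against your copy of the script.
        export JVM_OPTS="-XX:+UseConcMarkSweepGC"
        mlcp.sh copy -mode local \
            -input_host 192.168.1.46 -input_port 9000 \
            -input_username admin -input_password admin -input_database test \
            -output_host localhost -output_port 8000 \
            -output_username admin -output_password admin -output_database test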

    • Setting the "keep-alive" value to zero for the affected XDBC application server will disable socket pooling and may help to prevent this condition from arising; with keep-alive set to zero, sockets will not be re-used. Disabling keep-alive is not expected to have a significant negative impact on performance, although thorough testing is nevertheless advised.
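
      If you prefer to script that change rather than use the Admin UI, here is a hedged sketch against the Management REST API (default port 8002). The server name "test-xdbc", the group "Default", and the exact JSON property name are assumptions for illustration only; do a GET on the server's properties first to confirm what your MarkLogic version expects.

        # Sketch only: server name, group and property name are assumptions.
        # Inspect the current settings first:
        curl --anyauth -u admin:admin \
            "http://192.168.1.46:8002/manage/v2/servers/test-xdbc/properties?group-id=Default&format=json"

        # Then set keep-alive to zero for the affected XDBC app server:
        curl --anyauth -u admin:admin -X PUT \
            -H "Content-Type: application/json" \
            -d '{"keep-alive-timeout": 0}' \
            "http://192.168.1.46:8002/manage/v2/servers/test-xdbc/properties?group-id=Default"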