MLCP load compressed xml files and skip xml files with a specific xml tag


I am trying to load gzipped XML files, splitting each input file into multiple XML records. Is there a way in mlcp to skip a record during loading if a particular XML tag or value is present? If not, what are my other options?

These are the options I am currently using to load the gzipped XML file and split it into individual records:

import
-host
xxxxx
-port
xxxx
-username
xxxx
-password
xxxx
-batch_size
1
-input_compressed
true
-input_compression_codec
gzip
-input_file_type
aggregates
-output_collections
wos
-output_permissions
rest-reader,read,rest-writer,update
-output_uri_prefix
/wos/
-output_uri_suffix
.xml
-aggregate_record_element
REC
-aggregate_record_namespace
http://xxxx.yyyy.com
-uri_id
UID

Solution

  • The only approach I can think of is an MLCP transform (-transform_module and its related options) in which you conditionally pass through the $content map:map. Return the empty sequence for any aggregate fragment you want suppressed.

    HTH!
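
    A minimal sketch of such a transform, in case it helps. The module namespace, the skip prefix, and the idea of passing the element name through -transform_param are illustrative assumptions, not something from the question. The function reads the record from the "value" key of $content and returns the empty sequence when the record contains an element with the given local name, in any namespace:

    ```xquery
    xquery version "1.0-ml";

    (: Hypothetical namespace; use whatever you pass to -transform_namespace. :)
    module namespace skip = "http://example.com/mlcp/skip-records";

    (: MLCP calls this once per aggregate record.
       $content holds "uri" and "value" (the record as a document node);
       $context holds, among other things, "transform_param". :)
    declare function skip:transform(
      $content as map:map,
      $context as map:map
    ) as map:map*
    {
      (: Assumption: -transform_param carries the local name of the element
         whose presence should cause the record to be skipped. :)
      let $skip-name := map:get($context, "transform_param")
      let $record := map:get($content, "value")
      return
        if (exists($skip-name)
            and exists($record//*[local-name(.) eq $skip-name]))
        then ()        (: suppress this record :)
        else $content  (: pass the record through unchanged :)
    };
    ```

    The module has to be installed in the modules database of the app server that mlcp connects to. Assuming it is installed at the hypothetical path /transforms/skip-records.xqy and the element to look for is called SKIP_TAG (a placeholder), the options file would gain lines along these lines:

    ```
    -transform_module
    /transforms/skip-records.xqy
    -transform_namespace
    http://example.com/mlcp/skip-records
    -transform_param
    SKIP_TAG
    ```

    To skip on an element value rather than on mere presence, the predicate can be extended, for example with a comparison against the element's string value.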