I've been trying to use the mlcp script to load an RDF dataset composed of 2091 N-Quads files, representing a total of 727 million triples. I've used this command so far:
$ mlcp.sh import -username <myusername> -password <mypwd> -host localhost -port 8000 -input_file_path /home/to/path/ -output_override_graph http://mynamedgraph -mode local -input_file_type rdf
The error I got after 3205 sec is the following: "XDMP-FORESTERR: Error in merge of forest Documents: SVC-FILWRT: File write error: write '/var/opt/MarkLogic/Forests/Documents/00000101/TreeData': No space left on device" (details here [1]). However, I still have enough space on my disk (28G left).
What is strange in the command is that I don't see where to pass the dataset name.
What am I doing wrong?
If your merge max size is set to the default of 32 GB and you only have 28 GB free, a merge could start and then run out of space. Also, if you checked disk space after the merge failed, MarkLogic had already cleaned up the files from the merge.
It's important to remember that a merge is handled as a single transaction: if it runs out of space mid-transaction, it rolls back and the files created during the transaction are deleted.
MarkLogic recommends having enough free space to accommodate merging, typically around 50% larger than your database.
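To make the arithmetic above concrete, here is a rough Python sketch of a pre-load check. It is not part of MarkLogic or mlcp: the function names `merge_headroom` and `free_bytes` are made up for illustration, and the model is a simplifying assumption (a merge may need room for a new copy of the data being merged, capped by merge max size, taken here as the 32 GB default mentioned above).

```python
import shutil

GIB = 1024 ** 3  # bytes in one gibibyte


def free_bytes(path):
    """Return the free bytes on the filesystem that holds `path`."""
    return shutil.disk_usage(path).free


def merge_headroom(free, forest_size, merge_max=32 * GIB):
    """Estimate worst-case merge space and whether `free` covers it.

    Assumption (simplified model, not an official formula): a merge writes
    a new stand before the old ones are removed, so it may need up to
    min(forest size, merge max size) of extra disk space.
    Returns a (needed_bytes, is_enough) tuple.
    """
    needed = min(forest_size, merge_max)
    return needed, free >= needed
```

With 28 GB free and a forest that has grown past 32 GB, `merge_headroom(28 * GIB, 40 * GIB)` reports that 32 GB may be needed and that the free space is not enough, which matches the failure described above. On a real host you could pass `free_bytes("/var/opt/MarkLogic")` as the first argument; the forest size would have to be measured separately.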