I have a scheduled MarkLogic task that runs every day: it accesses an S3 bucket, processes a file (test.xml) in a directory, and then adds a flag file (test.done) to the same directory to signal that the file has been processed. I need to delete the files (both test.xml and test.done) periodically, based on the presence of the flag file. Is there an option in Amazon to create a job that deletes these files periodically?
Is there an option to use xdmp:http-delete()? If so, can someone share a sample request with headers to do it?
In MarkLogic, there is no supported way to delete files or directories. However, you can zero out their content by writing an empty text node to them.
Note that I said no 'supported' way. Two functions do exist in MarkLogic: xdmp:filesystem-directory-delete and xdmp:filesystem-file-delete. They are undocumented, which I take as a sign that they are unsupported and subject to change or removal, so I would caution against using them in production.
To delete the files via HTTP, see the AWS documentation on deleting S3 objects: http://docs.aws.amazon.com/AmazonS3/latest/dev/DeletingObjects.html
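Whichever delete call you end up using, the job first has to decide which objects are safe to remove. Here is a minimal sketch of that selection step in Python, assuming you already have a listing of the object keys under the directory. The function name, the suffix parameters, and the rule that both the data file and its flag must be present are my assumptions, not anything MarkLogic or AWS defines.

```python
def keys_ready_for_deletion(keys, data_suffix=".xml", flag_suffix=".done"):
    """Return the data/flag key pairs for which both objects are present.

    `keys` is any iterable of object keys, e.g. the result of listing the
    bucket prefix. Hypothetical helper, for illustration only.
    """
    present = set(keys)
    to_delete = []
    for key in sorted(present):
        if not key.endswith(flag_suffix):
            continue
        # "dir/test.done" -> "dir/test.xml"
        data_key = key[: -len(flag_suffix)] + data_suffix
        if data_key in present:
            # Data key first, flag last, so the flag outlives a partial delete.
            to_delete.append(data_key)
            to_delete.append(key)
    return to_delete


if __name__ == "__main__":
    listing = ["in/test.xml", "in/test.done", "in/other.xml"]
    print(keys_ready_for_deletion(listing))
```

The returned keys can then be handed to whatever performs the delete, for example a scheduled script that calls the S3 delete API described in the link above. Files without a flag (in/other.xml here) are left alone.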
Another option is to mount S3 on the local filesystem of the machine running MarkLogic and use the operating system to delete the files. In this case, you could also have MarkLogic write the test.done flag to a directory on the local filesystem, which then acts as a queue, and process the flags from the OS.
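If you go the mounted-filesystem route, the OS-side cleanup could look roughly like the sketch below, run from cron or a similar scheduler. It assumes the bucket directory is already mounted at some local path and that each flag file sits next to its data file. The function name and the suffix defaults are hypothetical.

```python
from pathlib import Path


def purge_processed(directory, data_suffix=".xml", flag_suffix=".done"):
    """Delete each data file that has a matching flag file, then the flag.

    Returns the names of the removed files. Hypothetical helper; the
    directory is assumed to be a locally mounted view of the bucket.
    """
    removed = []
    for flag in sorted(Path(directory).glob("*" + flag_suffix)):
        data = flag.with_suffix(data_suffix)  # test.done -> test.xml
        if data.is_file():
            # Remove the data file first: if the run is interrupted, the
            # flag is still there and the next run can finish the job.
            data.unlink()
            removed.append(data.name)
            flag.unlink()
            removed.append(flag.name)
    return removed
```

Calling purge_processed on the mounted directory removes test.xml and test.done together and leaves any file that has no flag yet.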