I'd like iterate over every document in a (probably big) Lotus Domino database and be able to continue it from the last one if the processing breaks (network connection error, application restart etc.). I don't have write access to the database.
I'm looking for a way where I don't have to download those documents from the server which were already processed. So, I have to pass some starting information to the server which document should be the first in the (possibly restarted) processing.
I've checked the AllDocuments
property and the DocumentColletion.getNthDocument
method but this property is unsorted so I guess the order can change between two calls.
Another idea was using a formula query but it does not seem that ordering is possible with these queries.
The third idea was the Database.getModifiedDocuments
method with a corresponding Document.getLastModified
one. It seemed good but
it looks to me that the ordering of the returned collection is not documented and based on creation time instead of last modification time.
Here is a sample code based on the official example:
System.out.println("startDate: " + startDate);
final DocumentCollection documentCollection =
database.getModifiedDocuments(startDate, Database.DBMOD_DOC_DATA);
Document doc = documentCollection.getFirstDocument();
while (doc != null) {
System.out.println("#lastmod: " + doc.getLastModified() +
" #created: " + doc.getCreated());
doc = documentCollection.getNextDocument(doc);
}
It prints the following:
startDate: 2012.07.03 08:51:11 CEDT
#lastmod: 2012.07.03 08:51:11 CEDT #created: 2012.02.23 10:35:31 CET
#lastmod: 2012.08.03 12:20:33 CEDT #created: 2012.06.01 16:26:35 CEDT
#lastmod: 2012.07.03 09:20:53 CEDT #created: 2012.07.03 09:20:03 CEDT
#lastmod: 2012.07.21 23:17:35 CEDT #created: 2012.07.03 09:24:44 CEDT
#lastmod: 2012.07.03 10:10:53 CEDT #created: 2012.07.03 10:10:41 CEDT
#lastmod: 2012.07.23 16:26:22 CEDT #created: 2012.07.23 16:26:22 CEDT
(I don't use any AgentContext
here to access the database. The database object comes from a session.getDatabase(null, databaseName)
call.)
Is there any way to reliably do this with the Lotus Domino Java API?
If you have access to change the database, or could ask someone to do so, then you should create a view that is sorted on a unique key, or modified date, and then just store the "pointer" to the last document processed.
Barring that, you'll have to maintain a list of previously processed documents yourself. In that case you can use the AllDocuments property and just iterate through them. Use the GetFirstDocument and GetNextDocument as they are reportedly faster than GetNthDocument.
Alternatively you could make two passes, one to gather a list of UNIDs for all documents, which you'll store, and then make a second pass to process each document from the list of UNIDs you have (using GetDocumentByUNID method).