
Python - Cloudant Get Changes


I'm using the Cloudant library to retrieve documents from a Cloudant database. Every time I run the Python script I get all the documents, but I would like to retrieve only the documents added since the last execution of the script; in other words, a get_changes function.

I have searched for an answer, but it does not seem to be easy to find.

Thanks for any help,

Filippo.


Solution

  • Use the changes() method. Keep track of the last sequence id, and restart from there to retrieve only the unseen changes.

    # Iterate over a "normal" _changes feed
    changes = db.changes()
    for change in changes:
        print(change)
    
    # ...time passes
    new_changes = db.changes(since=changes.last_seq)
    for new_change in new_changes:
        print(new_change)
    

    If you also want the doc body, you can pass include_docs=True.

    See https://github.com/cloudant/python-cloudant/blob/master/src/cloudant/database.py#L458
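
    Since the question is about separate executions of a script, the last sequence id has to survive between runs. A minimal sketch of one way to do that, assuming a small JSON checkpoint file (the file name and the helper names load_last_seq, save_last_seq and fetch_new_changes are made up for illustration; only db.changes(), its since argument and last_seq come from the answer above):

```python
import json
import os

STATE_FILE = "last_seq.json"  # hypothetical checkpoint file, not part of the library


def load_last_seq(path=STATE_FILE):
    """Return the sequence id saved by the previous run, or None on the first run."""
    if not os.path.exists(path):
        return None
    with open(path) as fh:
        return json.load(fh)["last_seq"]


def save_last_seq(seq, path=STATE_FILE):
    """Persist the sequence id so the next run can resume from it."""
    with open(path, "w") as fh:
        json.dump({"last_seq": seq}, fh)


def fetch_new_changes(db, path=STATE_FILE):
    """Return the changes made since the previous run and update the checkpoint."""
    kwargs = {}
    since = load_last_seq(path)
    if since is not None:
        # Only pass `since` when a checkpoint exists; the first run reads everything.
        kwargs["since"] = since
    feed = db.changes(**kwargs)
    changes = list(feed)  # consume the whole feed before reading last_seq
    if feed.last_seq is not None:
        save_last_seq(feed.last_seq, path)
    return changes
```

    Here db is assumed to be the database object already used in the snippet above; each run would call fetch_new_changes(db) once and process the returned list.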

    If you want to capture only new additions (as opposed to all changes), you can either create a filter function in a db design doc along the lines of:

    function(doc, req) {
        // Skip deleted docs
        if (doc._deleted) {
            return false;
        }
        // Skip design docs
        if (doc._id.indexOf('_design/') === 0) {
            return false;
        }
    
        // Skip updates
        if (doc._rev.indexOf('1-') !== 0) {
            return false;
        }
    
        return true;
    }
    

    and apply that to the changes feed:

    new_changes = db.changes(since=changes.last_seq, filter='myddoc/myfilter')
    for new_change in new_changes:
        print(new_change)  # do stuff here
    

    or, probably just as easily, simply get all the changes and filter them in the Python code.
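
    Filtering on the Python side could look roughly like the sketch below. It mirrors the three checks of the filter function and assumes each change is a dict in the usual _changes row shape, with "id", an optional "deleted" flag and a "changes" list of {"rev": ...} entries; the function names are hypothetical.

```python
def is_new_document(change):
    """Return True if a changes-feed row describes a newly created, non-design doc."""
    # Skip deleted docs
    if change.get("deleted"):
        return False
    # Skip design docs
    if change["id"].startswith("_design/"):
        return False
    # Skip updates: a first revision has a rev id beginning with "1-"
    return any(c["rev"].startswith("1-") for c in change.get("changes", []))


def only_new_documents(changes):
    """Keep only the rows that pass is_new_document."""
    return [c for c in changes if is_new_document(c)]
```

    For example, only_new_documents(db.changes(since=changes.last_seq)) would keep just the first-revision rows. As with the filter function, a document that was created and then updated between two runs shows up only with its latest revision, so this check would skip it.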

    Filter functions: https://console.bluemix.net/docs/services/Cloudant/guides/replication_guide.html#filtered-replication