CouchDB design view not returning all the records older than N seconds


I'm having a problem with a CouchDB design document view that should return records older than 5 seconds, based on the created property of the documents. Documents are continuously added to the mydb database.

For some reason, documents whose created value is earlier than the current time minus 5 seconds are not all returned.

This is how I add a document:

...
timestamp=$(( $(date +%s) * 1000 ))
...
curl -X POST -H "Content-Type: application/json" -d "{\"created\": ${timestamp}}" http://localhost:5984/mydb -u "admin:YOURPASSWORD" 

How I query using the view:

curl -X GET -u "admin:YOURPASSWORD" "http://localhost:5984/mydb/_design/nonce/_view/older_than?limit=100"

If I add just a few documents, things work fine, but when I keep adding documents repeatedly, the view doesn't return the expected number of records.

Any ideas what I might be doing wrong?

Here is an example project on GitHub, created specifically to find out what I'm doing wrong (it automates the examples above and includes a corresponding Go method that should delete the returned records):

https://github.com/igorrendulic/couchdb-test


Solution

  • It is key to understand that the emit function is invoked during indexing of a document at creation or update time rather than at query time.

    The emit function

    function(doc) 
    { 
        var now = Date.now() - (5 * 1000);
        if (doc.created < now) {
            emit(doc.created, doc._rev); 
        }
    }
    

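    This function compares the document's created field (notably assuming that field exists) with the time of indexing less 5 seconds. Date.now() is evaluated once, when the document is indexed, and not again when the view is queried. A document indexed less than 5 seconds after its creation is therefore not emitted, and it stays out of the index until it is updated, which does not seem useful.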
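To make the indexing behaviour concrete, here is a minimal sketch that imitates the indexer in plain JavaScript. It is not CouchDB code: `indexDocument`, `makeTimeDependentMap`, and the injected `clock` are hypothetical helpers introduced for illustration, and `emit` is passed in as a parameter, whereas in CouchDB it is a global provided to the map function.

```javascript
// Stand-in for the question's map function. The clock is injected so the
// example is deterministic; the real map function calls Date.now() directly.
function makeTimeDependentMap(clock) {
  return function (doc, emit) {
    var now = clock() - 5 * 1000;
    if (doc.created < now) {
      emit(doc.created, doc._rev);
    }
  };
}

// Hypothetical indexer: runs the map function once for a document, as happens
// when a new or updated document is processed, and collects the emitted rows.
function indexDocument(mapFn, doc, index) {
  mapFn(doc, function (key, value) {
    index.push({ id: doc._id, key: key, value: value });
  });
}

var currentTime = 1000000; // simulated clock, in milliseconds
var clock = function () { return currentTime; };
var index = [];

// The document is indexed 1 second after it was created,
// so it is not yet "older than 5 seconds" and nothing is emitted.
var doc = { _id: "a", _rev: "1-abc", created: currentTime - 1000 };
indexDocument(makeTimeDependentMap(clock), doc, index);

// Ten seconds pass. The document is now old enough, but the map function
// is not run again because the document itself has not changed.
currentTime += 10000;
console.log(index.length); // prints 0
```

Under these assumptions the document never shows up in the index, which would match the symptom in the question: documents that are indexed shortly after being added are missing from later results.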

    If the goal is to return the most recent documents whose created value is at least 5 seconds older than the current time, then this simple emit will suffice (and may provide more utility for other queries):

    function(doc) 
    {    
        if (doc.created) {
            emit(doc.created, doc._rev); 
        }
    }
    

    CouchDB's collation rules place smaller numeric values at the beginning of the index, so the query needs to reverse the sort order and supply the upper bound of the desired time window, in this case the current time less 5 seconds:

    ?descending=true&start_key={epoch_time_ms}&limit=100
    

    That query will return up to 100 documents with created less than or equal to epoch_time_ms with the most recent document first in the result set.
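The query semantics can be sketched in plain JavaScript as follows. Both helpers are hypothetical and only illustrate the idea: `queryDescending` imitates what a descending view query with a start key returns from an index keyed on created, and `buildOlderThanUrl` assembles the query string shown above using the view URL from the question. The timestamps are made-up sample values.

```javascript
// Imitates a descending view query: with descending=true, start_key acts as
// the upper bound, so rows with key <= startKey come back, highest key first.
function queryDescending(rows, startKey, limit) {
  return rows
    .filter(function (row) { return row.key <= startKey; })
    .sort(function (a, b) { return b.key - a.key; })
    .slice(0, limit);
}

// Builds the view URL for "created at least ageMs before nowMs".
function buildOlderThanUrl(viewUrl, nowMs, ageMs, limit) {
  var url = new URL(viewUrl);
  url.searchParams.set("descending", "true");
  url.searchParams.set("start_key", String(nowMs - ageMs));
  url.searchParams.set("limit", String(limit));
  return url.toString();
}

var nowMs = 1700000010000; // fixed sample timestamp so the run is reproducible
var rows = [
  { id: "a", key: nowMs - 9000 },
  { id: "b", key: nowMs - 6000 },
  { id: "c", key: nowMs - 2000 } // created 2 s ago: too recent, excluded
];

var result = queryDescending(rows, nowMs - 5000, 100);
console.log(result.map(function (row) { return row.id; })); // [ 'b', 'a' ]

var viewUrl = "http://localhost:5984/mydb/_design/nonce/_view/older_than";
console.log(buildOlderThanUrl(viewUrl, nowMs, 5000, 100));
```

The important design point is that the cutoff is computed by the client at query time, so the index itself holds only the time-independent created values.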

    I must add that emitting the _rev as the value seems of doubtful use, as that information can be obtained from the result document.