couchdb, cloudant

Make CouchDB calculate an average and keep it up to date when adding values


Is it possible to make CouchDB (Cloudant) calculate the average rating value of each document? If so, how do I do this in Cloudant?

Thanks.

{
  "_id": "2016-02-06T13:16:30.515Z",
  "_rev": "3-7521b9e21fbb58d5393a76d08b12ab12",
  "modelStrict": "Taj model",
  "ratings": {
    "2016-02-06T13:36:04.671Z": {
      "userEmail": "[email protected]",
      "rating": 1,
      "sessionNumber": 0
    },
    "2016-02-06T13:46:04.671Z": {
      "userEmail": "[email protected]",
      "rating": 3,
      "sessionNumber": 0
    },
    "2016-02-06T13:53:04.671Z": {
      "userEmail": "[email protected]",
      "rating": 3,
      "sessionNumber": 0
    }, 
    "2016-02-06T13:47:04.671Z": {
      "userEmail": "[email protected]",
      "rating": 5,
      "sessionNumber": 0
    }
  },
  "averageRating": ...
}

And how can I keep this averageRating up to date when a rating is added?

An added rating looks like this:

    "2016-02-06T17:57:04.671Z": {
      "userEmail": "[email protected]",
      "rating": 2,
      "sessionNumber": 10
    }

Solution

  • CouchDB doesn't do computed fields in documents. However, map/reduce is well suited to this calculation, because CouchDB updates its map/reduce views incrementally: when you query a view, only the documents that changed since the last query are re-processed. (You can read more about their design decisions and their impact on performance here.)

    My recommendation would be to create separate documents for each rating:

    {
      "modelStrict": "Taj model",
      "datetime": "2016-02-06T13:36:04.671Z",
      "userEmail": "[email protected]",
      "rating": 1,
      "sessionNumber": 0
    }
    
    {
      "modelStrict": "Taj model",
      "datetime": "2016-02-06T13:46:04.671Z",
      "userEmail": "[email protected]",
      "rating": 3,
      "sessionNumber": 0
    }
    
    {
      "modelStrict": "Taj model",
      "datetime": "2016-02-06T13:53:04.671Z",
      "userEmail": "[email protected]",
      "rating": 3,
      "sessionNumber": 0
    }
    
    {
      "modelStrict": "Taj model",
      "datetime": "2016-02-06T13:47:04.671Z",
      "userEmail": "[email protected]",
      "rating": 5,
      "sessionNumber": 0
    }
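
If the ratings already live inside one nested document, as in the question, they can be split into per-rating documents with a small script. The following is a minimal sketch only: `splitRatings` is a hypothetical helper name, and `user@example.com` is a placeholder for the obscured e-mail addresses.

```javascript
// Hypothetical helper: turn one nested "model" document into
// one standalone document per rating.
function splitRatings(modelDoc) {
  return Object.keys(modelDoc.ratings || {}).map(function (timestamp) {
    var r = modelDoc.ratings[timestamp];
    return {
      modelStrict: modelDoc.modelStrict,
      datetime: timestamp, // the old object key becomes a field
      userEmail: r.userEmail,
      rating: r.rating,
      sessionNumber: r.sessionNumber
    };
  });
}

// Sample input shaped like the document in the question (placeholder e-mails).
var nested = {
  modelStrict: "Taj model",
  ratings: {
    "2016-02-06T13:36:04.671Z": { userEmail: "user@example.com", rating: 1, sessionNumber: 0 },
    "2016-02-06T13:46:04.671Z": { userEmail: "user@example.com", rating: 3, sessionNumber: 0 }
  }
};

var ratingDocs = splitRatings(nested);
console.log(JSON.stringify({ docs: ratingDocs }, null, 2));
```

The printed object has the `{ "docs": [...] }` shape that CouchDB's `_bulk_docs` endpoint accepts, so the result could be posted there in one request. Adding a rating later is then just a matter of inserting one more document of the same shape.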
    

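    Then you can use a map function like this, which emits each rating keyed by modelStrict (which appears to be your grouping key):

    function (doc) {
      // Skip design documents and anything that is not a rating document.
      if (doc.modelStrict && typeof doc.rating === "number") {
        emit(doc.modelStrict, doc.rating);
      }
    }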
    

    Then you can use a reduce function to calculate the average (I modified the reduce function from here):

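    function (keys, values, rereduce) {
        if (!rereduce) {
            // First pass: values are the raw ratings emitted by the map function.
            var count = values.length;
            return [sum(values) / count, count];
        } else {
            // Rereduce: values are [average, count] pairs from earlier passes.
            var total = sum(values.map(function (v) { return v[1]; }));
            // Weight each partial average by its share of the total count.
            var avg = sum(values.map(function (v) {
                return v[0] * (v[1] / total);
            }));
            return [avg, total];
        }
    }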
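
To see why the reduce function carries the count along, here is a small sketch that runs the same logic outside CouchDB. It assumes a local stand-in for CouchDB's built-in `sum()` helper, and `averageReduce` is just an illustrative name; the way the four ratings are split into two batches is made up for the demonstration.

```javascript
// Stand-in for CouchDB's built-in sum() helper; only needed outside CouchDB.
function sum(values) {
  return values.reduce(function (a, b) { return a + b; }, 0);
}

// Same logic as the reduce function above, named so it can be called directly.
function averageReduce(keys, values, rereduce) {
  if (!rereduce) {
    var count = values.length;
    return [sum(values) / count, count];
  }
  var total = sum(values.map(function (v) { return v[1]; }));
  var avg = sum(values.map(function (v) { return v[0] * (v[1] / total); }));
  return [avg, total];
}

// All four ratings reduced in a single pass.
var direct = averageReduce(null, [1, 3, 3, 5], false);

// The same ratings reduced in two batches, then combined by a rereduce.
var partA = averageReduce(null, [1, 3], false);
var partB = averageReduce(null, [3, 5], false);
var combined = averageReduce(null, [partA, partB], true);

console.log(direct, combined);
```

Both paths give the same pair, which is the property CouchDB relies on when it reduces rows in batches and merges the partial results.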
    

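    When you query this view, you'll get [ 3, 4 ]: the average, followed by the number of values used to compute it. You need to return both because CouchDB may reduce the rows in batches and then rereduce the partial results; without the count, the rereduce step could not weight each partial average correctly. It's just one of those CouchDB-isms. If you store ratings for more than one model, query the view with group=true to get one [average, count] pair per modelStrict.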
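
As an alternative to a custom reduce (a different technique from the one above, offered only as a sketch): CouchDB also ships built-in reducers, and `_stats` returns the sum and count per key, from which the client can derive the average. The `response` object below is hand-written sample data shaped like a grouped view result for the four ratings in the question, and `averagesByKey` is a hypothetical helper name.

```javascript
// Sample data: the shape of a view response when the reduce is "_stats"
// and the view is queried with group=true. Values are for ratings 1, 3, 3, 5.
var response = {
  rows: [
    { key: "Taj model", value: { sum: 12, count: 4, min: 1, max: 5, sumsqr: 44 } }
  ]
};

// Hypothetical helper: compute the average rating per key on the client.
function averagesByKey(viewResponse) {
  var result = {};
  viewResponse.rows.forEach(function (row) {
    var stats = row.value;
    result[row.key] = stats.count > 0 ? stats.sum / stats.count : null;
  });
  return result;
}

var averages = averagesByKey(response);
console.log(averages);
```

The trade-off is one extra division on the client in exchange for not maintaining any reduce code in the design document.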

    As you use CouchDB more and more, you'll find yourself using more, smaller documents rather than merging a lot into the same document. This makes writing views more flexible and is much more efficient on disk usage, since adding a rating becomes a plain insert instead of a new revision of one ever-growing document. Of course, experiment with what works best for your own application. I hope this helps!