Search code examples
jsonnode.jscouchdbcloudant

Invalid UTF-8 JSON when sorting a Cloudant/CouchDB search


I received this error in particular when using Cloudant with NodeJS.

I created a Search Index in my database, and I tried to do the following search:

// Query to get last updated entries in database
var query = {
    q: "*:*",
    group_field: "type",
    sort: "-timestamp<number>",
    limit: 1
};

db.search('mydesigndoc', 'mysearchindex', query, function(err, data) {
    if (!err) {
        console.log(JSON.stringify(data));
    } else {
        console.log("No bio data found: " + err);
    }
});

However, the response returned is this:

{"error":"bad_request","reason":"invalid UTF-8 JSON"}

sort - A JSON string of the form "fieldname<type>" or -fieldname<type> for descending order, where fieldname is the name of a string or number field and type is either number or string or a JSON array of such strings.

Last time I checked, "-timestamp<number>" is a JSON string.

What did I do wrong?


Solution

  • The problem lies in the handling of the query object. When sent to the server, we can assume the literal value of each of these properties.

    For example, if the function internally sent a URL form encoded request to Cloudant, we'd see something like the following:

    /_design/mydesigndoc/_search/mysearchindex?q=*%3A*&group_field=type&sort=-timestamp<number>&limit=1
    

    The key point here is that quotes have been removed because they're strings. As a result, when Cloudant sees this, sort is no longer a JSON string enclosed by quotes - it is a normal string! It needs to be JSON as the sort parameter can also be a JSON array if you're sorting multiple fields.

    The solution is to simply add escaped quotes inside the sort property:

    var query = {
        q: "*:*",
        group_field: "type",
        sort: "\"-timestamp<number>\"",
        limit: 1
    };
    

    Now it will appear like this when sent to the server:

    &sort="-timestamp<number>"