Tags: mongodb, elasticsearch, elasticsearch-mongo-river

elasticsearch-mongo-river: how to change a data field


I use MongoDB to store login data in a collection, for example:

// userA logs in to module mA:

{
  "_id" : ObjectId("uuid 1"),
  "user" : "userA",
  "module" : "mA",
  "collectDate" : ISODate("2013-03-18T08:25:11.601Z")
}

// userA also logs in to module mB:

{
  "_id" : ObjectId("uuid 2"),
  "user" : "userA",
  "module" : "mB",
  "collectDate" : ISODate("2013-03-18T08:34:15.106Z")
}

// userB logs in to module mB:

{
  "_id" : ObjectId("uuid 3"),
  "user" : "userB",
  "module" : "mB",
  "collectDate" : ISODate("2013-03-18T08:34:15.106Z")
}

I then use the map-reduce framework to count how many times each user logs in per day. The results look like:

// the count is 2, since userA logged in to mA and mB

{
  "_id" : {
    "user" : "userA",
    "date" : ISODate("2013-03-18")
  },
  "value" : {
    "count" : 2.0
  }
}

// the count is 1, since userB logged in to mB only

{
  "_id" : {
    "user" : "userB",
    "date" : ISODate("2013-03-18")
  },
  "value" : {
    "count" : 1.0
  }
}
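
For reference, the counting step described above could be sketched roughly as follows. This is only an illustration: the names dayOf, mapLogin, reduceCounts and runMapReduce are hypothetical, and the small in-memory driver stands in for the real map-reduce engine so the logic can be run on its own. It assumes "per day" means the UTC calendar day of collectDate.

```javascript
// Sketch only: names and the in-memory driver are illustrative assumptions,
// not the actual job that produced the results above.

// Truncate a timestamp to midnight UTC so all logins on one day share a key.
function dayOf(date) {
  return new Date(Date.UTC(date.getUTCFullYear(), date.getUTCMonth(), date.getUTCDate()));
}

// Map step: one (key, value) pair per login document.
function mapLogin(doc) {
  return { key: { user: doc.user, date: dayOf(doc.collectDate) }, value: { count: 1 } };
}

// Reduce step: sum the partial counts collected for one key.
function reduceCounts(key, values) {
  let total = 0;
  for (const v of values) total += v.count;
  return { count: total };
}

// Tiny in-memory stand-in for the map-reduce engine: group by key, then reduce.
function runMapReduce(docs) {
  const groups = new Map();
  for (const doc of docs) {
    const { key, value } = mapLogin(doc);
    const id = key.user + "|" + key.date.toISOString();
    if (!groups.has(id)) groups.set(id, { key, values: [] });
    groups.get(id).values.push(value);
  }
  return [...groups.values()].map(g => ({ _id: g.key, value: reduceCounts(g.key, g.values) }));
}

// The three sample login documents from the question.
const logins = [
  { user: "userA", module: "mA", collectDate: new Date("2013-03-18T08:25:11.601Z") },
  { user: "userA", module: "mB", collectDate: new Date("2013-03-18T08:34:15.106Z") },
  { user: "userB", module: "mB", collectDate: new Date("2013-03-18T08:34:15.106Z") },
];

const reduced = runMapReduce(logins);
console.log(JSON.stringify(reduced, null, 2));
```

Each output document has the same nested shape as the results shown above: the grouping key ends up under _id and the count under value.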

The question is: how do I move the reduced results into Elasticsearch? I want the data in ES to look like:

{
  "user" : "userA",
  "date" : ISODate("2013-03-18"),
  "count" : 2
},
{
  "user" : "userB",
  "date" : ISODate("2013-03-18"),
  "count" : 1
}

How can I flatten the nested data?

I use the following command:

curl -XPUT "es.server.com:9200/_river/Entity/_meta" -d '
{
  "type":"mongodb",
  "mongodb":{
    "servers":[{"host":"mongodb.server.com","port":27017 }],
    "options":{"secondary_read_preference":true},
    "db":"udc","collection":"Entity"
  },
  "index":{
    "name":"udc","type":"Entity"
  }
}'

But the imported data does not meet my requirement, as shown below:

{
  _index: udc
  _type: user
  _id: { 
    "user" : "userA" , 
    "date" : { 
      "$date" : "2013-06-22T07:00:00.000Z"
    }
  }
  _version: 1
  _score: null
  _source: {
    _id: {
      date: 2013-06-22T07:00:00.000Z
      user: userA
    }
    value: {
      count: 18
    }
  }
  sort: [
    user
  ]
}

Solution

  • Take a look at the script support in the river.

    What you want is something like:

    ctx.document.user = ctx.document._id.user; ...; delete ctx.document._id; delete ctx.document.value;
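
To see what such a script would do to one reduced document, here is a minimal sketch that can be run outside the river. The ctx object is a plain mock of the context the river passes to its script, and flattenReduced is a hypothetical helper name. The date and count assignments are an assumption about what the elided middle of the one-liner would contain, based on the target shape in the question.

```javascript
// Sketch only: `ctx` is a plain object standing in for the river's script
// context, and flattenReduced is a hypothetical name used for illustration.
function flattenReduced(ctx) {
  const doc = ctx.document;
  doc.user = doc._id.user;     // from the answer's one-liner
  doc.date = doc._id.date;     // assumption: lift the date the same way
  doc.count = doc.value.count; // assumption: lift the count out of value
  delete doc._id;              // drop the nested key object
  delete doc.value;            // drop the nested value object
  return ctx;
}

// Mock of one reduced document as it arrives from MongoDB.
const ctx = {
  document: {
    _id: { user: "userA", date: "2013-03-18T00:00:00.000Z" },
    value: { count: 2 },
  },
};

flattenReduced(ctx);
console.log(JSON.stringify(ctx.document));
// prints {"user":"userA","date":"2013-03-18T00:00:00.000Z","count":2}
```

In the river itself only the statements from the function body would go into the script, written against ctx.document directly as in the one-liner above.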