I have records, each of which is a dictionary of performance samples taken at a specific revision of our source code. I am storing these in our Eve database. We run this performance test for every revision, and we have over 20,000 revisions.
I can get the values using http://host/api/performance?projection={"FileIO.Reads":1,"Revision":1}, which gives me 20,000 records shaped like this:
{
    "_items" : [
        {
            "_id" : ... ,
            "_updated" : ...,
            "_created" : ...,
            "_etag" : ...,
            "Revision" : 1000,
            "FileIO" : {
                "Reads" : [20.34, 10, 30]  # avg/min/max
            }
        },
        # next item
        {
            "_id" : ... ,
            "_updated" : ...,
            "_created" : ...,
            "_etag" : ...,
            "Revision" : 1001,
            "FileIO" : {
                "Reads" : [23, 10, 50]  # avg/min/max
            }
        }
        # and so on
    ]
}
Is there some way to ask Eve, or better still MongoDB itself, to group all of these into a single value of the form [[Revision, Reads], [Revision, Reads], ...], or even flattened to [Revision, Avg, Min, Max] per entry, to minimize the JSON-conversion, performance, and bandwidth costs?
Should I do my own processing in the event hooks? If so, in what way?
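For example, I could imagine reshaping the payload in Eve's on_fetched_resource event hook, something like this rough sketch (the performance resource name is taken from the URL above; untested):

    from eve import Eve

    app = Eve()

    def compact_performance(resource, response):
        # After a GET on the resource, replace each full document with a
        # compact [Revision, [avg, min, max]] pair before serialization.
        if resource != 'performance':  # resource name assumed from the URL above
            return
        response['_items'] = [
            [item['Revision'], item['FileIO']['Reads']]
            for item in response['_items']
        ]

    app.on_fetched_resource += compact_performance

    if __name__ == '__main__':
        app.run()

That would shrink the payload, but MongoDB would still ship all 20,000 full documents to Eve and Python would reshape them on every request, which is why pushing the work into the database feels preferable.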
I think I should be able to do this with some sort of aggregation, but it isn't clear to me how to merge my Revision with my FileIO.Reads.
I don't really have any other ideas for how to store this data; we simply have a dictionary of performance values per revision.
I did some sleuthing and mucking about and came up with the following aggregation pipeline. I don't know how efficient it is, but it does what I need it to do. I roughly understand how it works, but the double grouping seems like it should be unnecessary (see the simpler sketch after the result below).
db.getCollection('test_profiles').aggregate([
    // Stage 1: one group per distinct (Revision, Reads) pair.
    { $group: {
        _id : { revision : "$Revision", value : "$FileIO.Reads" }
    }},
    // $unwind on the non-array _id passes each document through unchanged.
    { $unwind : "$_id" },
    // Stage 2: collapse everything into a single document.
    { $group: {
        _id : null,
        values : { $push: "$_id" }
    }}
])
This yields the following kind of record:
{
    "_id" : null,
    "values" : [
        {
            "revision" : 109999,
            "value" : [
                0.903873742,
                0.00723229861,
                1.23190153
            ]
        },
        {
            "revision" : 109998,
            "value" : [
                0.903873742,
                0.00723229861,
                1.23190153
            ]
        },
        // .. and on and on
    ]
}
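For the record, the double grouping does look avoidable: $push accepts a document expression, so a single $group should produce the same shape directly. A minimal sketch using PyMongo, with the connection details and database name assumed:

    from pymongo import MongoClient

    client = MongoClient('mongodb://host:27017')  # connection details assumed
    db = client['test']                           # database name assumed

    # Collapse the whole collection into one document in a single $group;
    # $push can build the {revision, value} sub-document itself, so the
    # first grouping and the $unwind above should be unnecessary.
    pipeline = [
        {'$group': {
            '_id': None,
            'values': {'$push': {'revision': '$Revision',
                                 'value': '$FileIO.Reads'}},
        }},
    ]

    # For the flattened [Revision, Avg, Min, Max] form, the $push expression
    # could instead be (MongoDB 3.2+):
    #   {'$concatArrays': [['$Revision'], '$FileIO.Reads']}

    result = next(db.test_profiles.aggregate(pipeline))
    print(result['values'][:2])

One difference worth noting: the first $group in my pipeline also deduplicates identical (revision, value) pairs, while a plain $push keeps every document; with one profile per revision that should not matter.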