Search code examples
nosqlrethinkdbrethinkdb-ruby

Possible to return multiple results with reduce function?


With the following schema (defined below). I can use map reduce to aggregate the delivered_count field for all days (which is an embedded array inside the campaign document).

  {
    campaign_id: 1,
    status: 'running',
    dates: {
      '20130926' => {
        delivered: 1,
        failed: 1,
        queued: 1,
        clicked: 1,
        males_count: 1,
        females_count: 1,
        pacific_region: { clicked_count: 10 },
        america_region: { clicked_count: 10 },
        atlantic_region: { clicked_count: 10 },
        europe_region: { clicked_count: 10 },
        africa_region: { clicked_count: 10 },
        etc_region: { clicked_count: 10 },
        asia_region: { clicked_count: 10 },
        australia_region: { clicked_count: 10 }
      },
      '20130927' => {
        delivered: 1,
        failed: 1,
        queued: 1,
        clicked: 1,
        males_count: 1,
        females_count: 1,
        pacific_region: { clicked_count: 10 },
        america_region: { clicked_count: 10 },
        atlantic_region: { clicked_count: 10 },
        europe_region: { clicked_count: 10 },
        africa_region: { clicked_count: 10 },
        etc_region: { clicked_count: 10 },
        asia_region: { clicked_count: 10 },
        australia_region: { clicked_count: 10 }
      },
      '20130928' => {
        delivered: 1,
        failed: 1,
        queued: 1,
        clicked: 1,
        males_count: 1,
        females_count: 1,
        pacific_region: { clicked_count: 10 },
        america_region: { clicked_count: 10 },
        atlantic_region: { clicked_count: 10 },
        europe_region: { clicked_count: 10 },
        africa_region: { clicked_count: 10 },
        etc_region: { clicked_count: 10 },
        asia_region: { clicked_count: 10 },
        australia_region: { clicked_count: 10 }
      }
    }
  }

The code below parses through field asia_regions outputs the value of field clicked_count => 30 (combined value of all data)

$rethinkdb.table(:daily_stat_campaigns).filter { |daily_stat_campaign| daily_stat_campaign[:campaign_id].eq 1 }[0][:dates].do { |doc|
  doc.keys.map { |key|
    doc.get_field(key)[:asia_region][:clicked_count].default(0)
  }.reduce { |left, right|
    left+right
  }
}.run

Is it possible to run the code above but against multiple regions? This way I can run one query which will return multiple sums. The output i'm trying to achieve is something similar to the pseudo result below.

[{ asia_region: {clicked_count: 30}}, {america_region: {clicked_count: 30} }]

Solution

  • This seems to work:

    require 'awesome_print' # For better readability on output
    
    regions = [:pacific_region, :america_region]
    reg_clicks = $rethinkdb.table(:daily_stat_campaigns).filter { |daily_stat_campaign| daily_stat_campaign[:campaign_id].eq 1 }[0][:dates].do { |doc|
      doc.keys.concat_map { |key|
        doc
        .get_field(key)
        .pluck(regions)
        .coerce_to("ARRAY")
      }
    }
    ap reg_clicks.run
    

    Will output something like: [["america_region", {"clicked_count"=>10}], ["pacific_region", {"clicked_count"=>10}], ["america_region", {"clicked_count"=>10}], ["pacific_region", {"clicked_count"=>10}], ["america_region", {"clicked_count"=>10}], ["pacific_region", {"clicked_count"=>10}]]

    aggregate = reg_clicks.map { |reg|
      { reg: reg[0], clicked_count: reg[1][:clicked_count] }
    }
    ap aggregate.run
    

    Will output: [{"reg"=>"america_region", "clicked_count"=>10}, {"reg"=>"pacific_region", "clicked_count"=>10}, {"reg"=>"america_region", "clicked_count"=>10}, {"reg"=>"pacific_region", "clicked_count"=>10}, {"reg"=>"america_region", "clicked_count"=>10}, {"reg"=>"pacific_region", "clicked_count"=>10}]

    ap aggregate.group_by(:reg, $rethinkdb_rql.sum(:clicked_count)).run
    

    Outputs: [{"reduction"=>30, "group"=>{"reg"=>"america_region"}}, {"reduction"=>30, "group"=>{"reg"=>"pacific_region"}}]