Search code examples
ruby-on-railsdatabasepostgresqlruby-on-rails-6

Query a jsonb column to get sum of innermost values by outermost key in rails


I have a model named organization request having many columns like service, country, request type, certificate type. I want to save organization request count in a separate table named Statistics. Both tables are in Postgresql. Everynight through a rake task, I will save whole day's count in statistics table in json format. Column name is daily_stats and type is jsonb. Purpose of the statistics table is to keep count for long time. Since after a certain period, organization requests will be deleted. So in that case Statistics will be useful. Example of a daily_stats record:

{"tink": {"sweden": {"api": {"consumer": {"organization_1": 53, "organization_2": 77}, "corporate": {"organization_1": 80, "organization_2": 38}}, "iframe": {"consumer": {"organization_1": 21, "organization_2": 51}, "corporate": {"organization_1": 41, "organization_2": 11}}, "generic_link": {"consumer": {"organization_1": 83, "organization_2": 25}, "corporate": {"organization_1": 84, "organization_2": 45}}, "manual_request": {"consumer": {"organization_1": 64, "organization_2": 97}, "corporate": {"organization_1": 39, "organization_2": 44}}}, "finland": {"api": {"consumer": {"organization_1": 20, "organization_2": 71}, "corporate": {"organization_1": 76, "organization_2": 59}}, "iframe": {"consumer": {"organization_1": 48, "organization_2": 79}, "corporate": {"organization_1": 74, "organization_2": 17}}, "generic_link": {"consumer": {"organization_1": 25, "organization_2": 30}, "corporate": {"organization_1": 52, "organization_2": 3}}, "manual_request": {"consumer": {"organization_1": 2, "organization_2": 2}, "corporate": {"organization_1": 4, "organization_2": 76}}}}, "enable_banking": {"sweden": {"api": {"consumer": {"organization_1": 17, "organization_2": 30}, "corporate": {"organization_1": 89, "organization_2": 23}}, "iframe": {"consumer": {"organization_1": 57, "organization_2": 21}, "corporate": {"organization_1": 9, "organization_2": 55}}, "generic_link": {"consumer": {"organization_1": 18, "organization_2": 86}, "corporate": {"organization_1": 67, "organization_2": 5}}, "manual_request": {"consumer": {"organization_1": 2, "organization_2": 46}, "corporate": {"organization_1": 43, "organization_2": 88}}}, "finland": {"api": {"consumer": {"organization_1": 55, "organization_2": 22}, "corporate": {"organization_1": 38, "organization_2": 54}}, "iframe": {"consumer": {"organization_1": 84, "organization_2": 62}, "corporate": {"organization_1": 51, "organization_2": 60}}, "generic_link": {"consumer": {"organization_1": 74, "organization_2": 84}, "corporate": {"organization_1": 63, "organization_2": 95}}, "manual_request": {"consumer": {"organization_1": 37, "organization_2": 80}, "corporate": {"organization_1": 34, "organization_2": 31}}}}}

I want to fetch record from it using queries like active record. I have tried using map but that gets too complicated and also using map on large number of records will affect performance. I searched a lot but could not find a link to extract data from jsonb type as per different conditions.

Yesterday, I was able to get sum from below postgresql query:

select sum(stats.val::numeric) as total
from(
    select key, sum(value::numeric) val
    from statistics t,
    jsonb_each_text(daily_stats->'tink'->'sweden'->'api'->'consumer')
    group by key) as stats;

Here I can get total of all values at given jsonpath of all records of daily_stats column but still, I am not able to extract values using only single key, i.e. "tink". I want a query which gives me sum of all values sitting at innermost keys 'organization_1' and 'organization_2' of 'tink' key. Can anybody help?


Solution

  • This query solved my problem:

    WITH new_table AS (
        select 
           s.service service,
           m.market market, 
           rt.request_type request_type,
           ct.certificate_type certificate_type,
           stats.organization organization,
           (stats.req_count::text)::int req_count
        from statistics b
          cross join json_each(daily_stats::json) as s(service, service_hash)
          cross join json_each(service_hash::json) as m(market, market_hash)
          cross join json_each(market_hash::json) as rt(request_type, request_type_hash)
          cross join json_each(request_type_hash::json) as ct(certificate_type, certificate_type_hash)
          cross join json_each(certificate_type_hash::json) as stats(organization, req_count)
    )     
    
    select SUM(req_count) from new_table where service = 'tink' and market = 'sweden' and request_type = 'api' and certificate_type = 'corporate';
    

    Here in last SELECT statement, by changing arguments in where clause, we can get required results from statistics table.