Search code examples
ruby-on-railsarraysjsonpostgresqljsonb

Query on Postgres JSON array field in Rails


I am trying to query a certain value in a Postgres database. I have a field named groups in the users table that can be represented in either of these ways:

1.

groups: {"data"=>[{"serie"=>5, "year"=>3, "specialization"=>"Matematica", "management_id"=>1, "group_number"=>2}, {"serie"=>5, "year"=>3, "specialization"=>"Matematica", "management_id"=>1, "group_number"=>2}]}

2.

groups: [{"serie"=>5, "year"=>3, "specialization"=>"Matematica", "management_id"=>1, "group_number"=>2}, {"serie"=>5, "year"=>3, "specialization"=>"Matematica", "management_id"=>1, "group_number"=>2}]

I am fine with either of this representations. However I just can't seem to find out how to get all the users that are in serie 5 let's say. I tried multiple queries along the lines of:

@users = User.where("groups ->> 'data' @>  ?", {serie: 5})
@users = User.where("groups -> 'data' @>  '?'", {serie: 5})
@users = User.where("groups ->> 'data' ->> 'serie' = ?", 5)

And many other attempts, some more stupid than others (see above). How would I do it?

I have been able to determine that:

select groups -> 'data' ->> 'serie' from users;  
ERROR: cannot extract field from a non-object.

However the following query works:

select json_array_elements(groups -> 'data') ->> 'serie' from users;

I think I am not properly delivering the data in the column. The hash I am providing to create is:

pry(#<Overrides::RegistrationsController>)> @response['data']['user']
=> {"last_name"=>"Doe1",
 "first_name"=>"John1",
 "email"=>"[email protected]",
 "groups"=>
  {"data"=>
    [{"serie"=>5, "year"=>3, "specialization"=>"Matematica", "management_id"=>1, "group_number"=>2}, {"serie"=>5, "year"=>3, "specialization"=>"Matematica", "management_id"=>1, "group_number"=>2}]}}

Before saving the resource looks like this:

pry(#<Overrides::RegistrationsController>)> @resource
=> #<User id: nil, provider: "email", uid: "", first_name: "John1", last_name: "Doe1", email: "[email protected]", role: "Student", created_at: nil, updated_at: nil, groups: {"data"=>[{"serie"=>5, "year"=>3, "specialization"=>"Matematica", "management_id"=>1, "group_number"=>2}, {"serie"=>5, "year"=>3, "specialization"=>"Matematica", "management_id"=>1, "group_number"=>2}]}>

Solution

  • Assumptions

    • Postgres 9.4 or later.
    • "get all the users that are in serie 5" is supposed to mean:
      "with at least one array element that contains {"serie": 5}. There may be others."
    • Working with your first, shorter data format. No redundant 'data' key.

    Short answer

    Use jsonb instead of json and this just works:

    User.where("groups @> ?", '[{"serie": 5}]')
    

    Note the square brackets to make the right-hand operand a JSON array.

    Why?

    The prominent misunderstanding here: data type json is not the same as jsonb.

    You didn't declare the actual table definition, but you later commented json and there is a hint in the question:

    select json_array_elements(groups -> 'data') ->> 'serie' from users;
    

    json_array_elements() only works for json, would have to be jsonb_array_elements() for jsonb. But you try to use the jsonb operator @> which is not defined for json:

    groups -> 'data' @>  '?'
    

    The operator -> returns the same type as the left-hand input. But @> is only defined for jsonb, not for json.

    Then you try to use the operator @> for text as left-hand operand. Not possible either:

    groups ->> 'data' @>  ?
    

    There are variants of the operator @> for various types (incl. Postgres arrays), but not for text and not for json.

    So the answer is: Use jsonb instead of json. This allows to use very efficient indexes, too:

    SQL Demos

    json

    For data type json you could use:

    SELECT *
    FROM   users u
    WHERE  EXISTS (
       SELECT FROM json_array_elements(u.groups) elem 
       WHERE  elem ->> 'serie' = '5'
       );
    

    jsonb

    After altering the column type to jsonb:

    ALTER TABLE users ALTER column groups TYPE jsonb;
    

    Simply:

    SELECT *
    FROM   users
    WHERE  groups @> '[{"serie": 5}]';
    

    fiddle