Search code examples
ruby-on-railsrubypostgresqlactiverecordjsonb

Building Activerecord / SQL query for jsonb value search


Currently, for a recurring search with different parameters, I have this ActiveRecord query built:

current_user.documents.order(:updated_at).reverse_order.includes(:groups,:rules)

Now, usually I tack on a where clause to this to perform this search. However, I now need to do a search through the jsonb field for all rows that have a certain value as in the key:value pair. I've been able to do something a similar to that in my SQL, with this syntax (the data field will only be exactly two levels nested):

SELECT 
    * 
FROM 
    (SELECT 
         * 
    FROM 
        (SELECT 
            * 
        FROM 
            documents
        ) A, 
        jsonb_each(A.data)
    ) B, 
    jsonb_each_text(B.value) ASC C 
WHERE 
    C.value = '30';

However, I want to use the current ActiveRecord search to make this query (which includes the groups/rules eager loading).

I'm struggling with the use of the comma, which I understand is an implicit join, which is executed before explicit joins, so when I try something like this:

select * from documents B join (select * from jsonb_each(B.data)) as A on true;
ERROR:  invalid reference to FROM-clause entry for table "b"
LINE 1: ...* from documents B join (select * from jsonb_each(B.data)) a...
                                                           ^
HINT: There is an entry for table "b", but it cannot be referenced from this part of the query.

But I don't understand how to reference the complete "table" the ActiveRecord query I have creates before I make a joins call, as well as make use of the comma syntax for implicit joins to work.

Also, I'm an SQL amateur, so if you see some improvements or other ways to do this, please do tell.

EDIT: Description of documents table:

enter image description here

                                                        Table "public.documents"
Column      |            Type             |                       Modifiers                        | Storage  | Stats target | Description 
------------+-----------------------------+--------------------------------------------------------+----------+--------------+-------------
id          | integer                     | not null default nextval('documents_id_seq'::regclass) | plain    |              | 
document_id | character varying           |                                                        | extended |              | 
name        | character varying           |                                                        | extended |              | 
size        | integer                     |                                                        | plain    |              | 
last_updated| timestamp without time zone |                                                        | plain    |              | 
user_id     | integer                     |                                                        | plain    |              | 
created_at  | timestamp without time zone |                                                        | plain    |              | 
updated_at  | timestamp without time zone |                                                        | plain    |              | 
kind        | character varying           |                                                        | extended |              | 
uid         | character varying           |                                                        | extended |              | 
access_token_id | integer                     |                                                        | plain    |              | 
data        | jsonb                       | not null default '{}'::jsonb                           | extended |              | 

Indexes: "documents_pkey" PRIMARY KEY, btree (id) ```

Sample rows, first would match a search for '30' (data is the last field):

2104 | 24419693037                                          | LsitHandsBackwards.jpg                   |        |                         |       1 | 2017-06-25 21:45:49.121686 | 2017-07-01 21:32:37.624184 | box          | 221607127         |              15 | {"owner": {"born": "to make history", "price": 30}}
2177 | /all-drive/uml flows/typicaluseractivity.svg         | TypicalUserActivity.svg                  |  12375 | 2014-08-11 02:21:14     |       1 | 2017-07-07 14:00:11.487455 | 2017-07-07 14:00:11.487455 | dropbox      | 325694961         |              20 | {"owner": {}}

Solution

  • You can use a query similar to the one you already showed:

    SELECT
        d.id, d.data
    FROM
        documents AS d 
        INNER JOIN json_each(d.data)  AS x ON TRUE 
        INNER JOIN json_each(x.value) AS y ON TRUE 
    WHERE
        cast(y.value as text) = '30';
    

    Assuming your data would be the following one:

    INSERT INTO documents
        (data)
    VALUES
       ('{"owner": {"born": "to make history", "price": 30}}'),
       ('{"owner": {}}'),
       ('{"owner": {"born": "to make history", "price": 50}, "seller": {"worth": 30}}')
    ;
    

    The result you'd get is:

    id | data                                                                        
    -: | :---------------------------------------------------------------------------
     1 | {"owner": {"born": "to make history", "price": 30}}                         
     3 | {"owner": {"born": "to make history", "price": 50}, "seller": {"worth": 30}}
    

    You can check it (together with some step-by-step looks at the data) at dbfiddle here