Search code examples
sqljsonpostgresqlgreatest-n-per-group

Postgres find unique values ​in json


I am using Postgresql and have a table, with id, sender::jsonb and date, as follows:

id |                                      sender                                        | last login date                 
----+-----------------------------------------------------------------------------------+----------------------------------
  1 | {"firstName": "Ivan",    "lastName": "Ivanov",       "middleName": "Ivanovich", } | 2021-04-10 14:49:36.234504 +00:00   
  2 | {"firstName": "Ivan",    "lastName": "Ivanov",       "middleName": "Ivanovich", } | 2021-04-09 14:49:36.234504 +00:00 
  3 | {"firstName": "Ivan",    "lastName": "Ivanov",       "middleName": "Ivanovich", } | 2021-04-11 14:49:36.234504 +00:00 
  4 | {"firstName": "Nickolai","lastName": "Nickov",       "middleName": "Nikovich", }  | 2021-04-30 14:49:36.234504 +00:00 
  5 | {"firstName": "Nickolai","lastName": "Nickov",       "middleName": "Nikovich", }  | 2021-04-29 14:49:36.234504 +00:00 
  6 | {"firstName": "Vladimir","lastName": "Vladimirovich","middleName": "Putout", }    | 2021-04-15 14:49:36.234504 +00:00 
  7 | {"firstName": "Petr",    "lastName": "Petrov",       "middleName": "Petrovich", } | 2021-04-10 14:49:36.234504 +00:00 
  8 | {"firstName": "Ivan",    "lastName": "Ivanov",       "middleName": "Ivanovich", } | 2021-04-01 14:49:36.234504 +00:00 
  9 | {"firstName": "Ignat",   "lastName": "Ignatov",      "middleName": "Ignatovich", }| 2021-04-06 14:49:36.234504 +00:00 
  10| {"firstName": "Vladimir","lastName": "Vladimirovich","middleName": "Putout", }    | 2021-04-17 14:49:36.234504 +00:00 
  11| {"firstName": "Ivan",    "lastName": "Ivanov",       "middleName": "Ivanovich", } | 2021-04-12 14:49:36.234504 +00:00 

p.s.There may be other information in the "sender" column, but the search for uniqueness is only necessary by "firstName", "lastName", "midddleName"

It is necessary to return a result consisting of unique names and with the latest date. In particular, I want to get the result:

id |                                      sender                                        | last login date                 
----+-----------------------------------------------------------------------------------+----------------------------------
  4 | {"firstName": "Nickolai","lastName": "Nickov",       "middleName": "Nikovich", }  | 2021-04-30 14:49:36.234504 +00:00 
  10| {"firstName": "Vladimir","lastName": "Vladimirovich","middleName": "Putout", }    | 2021-04-17 14:49:36.234504 +00:00 
  11| {"firstName": "Ivan",    "lastName": "Ivanov",       "middleName": "Ivanovich", } | 2021-04-12 14:49:36.234504 +00:00 
  7 | {"firstName": "Petr",    "lastName": "Petrov",       "middleName": "Petrovich", } | 2021-04-10 14:49:36.234504 +00:00 
  9 | {"firstName": "Ignat",   "lastName": "Ignatov",      "middleName": "Ignatovich", }| 2021-04-06 14:49:36.234504 +00:00 

Everything is very complicated by the fact that json is used. I had thoughts to do - "name" concatenation and perform group by and sorting, but unfortunately it does not work.


Solution

  • You can use distinct on() to do this:

    select distinct on (firstname, lastname) id, sender, last_login_date
    from (
       select id, sender, last_login_date, 
              sender ->> 'firstName' as firstname,
              sender ->> 'lastName' as lastname
       from the_table
    ) t
    order by firstname, lastname, last_login_date desc