Search code examples
jsondatabasepostgresqlnosqldatabase-schema

Need advice on database schema for storing complex json


I cant decide whether to use postgres, cassandra or another nosql

the json collection is like this

[{
id:
name: 
year: 
credits: [{id: "", name: "", role: ""}...]
genres: [{id: "", name: ""}...]
seasons: [
         { id:
           name:
           season_number:
           episodes: [
                     {id: "", 
                      name: "", 
                      season_number: "", 
                      episode_number: ""}, 
                      ...] 
         }, 
         ...]
},
...]

from above, the collection itself is an array of objects. each object has four nested arrays for values, namely keys credits, genres, seasons, episodes.

the full object needs to be read every time because all the fields are needed to be shown on the frontend.

the items in the episodes key will be inserted or deleted the most.
they are grouped, split into seasons which is also an array

initial sql table schema was one row per episode, but that creates a lot of json redundancy

its important to have a query that returns the above object in its original structure


Solution

  • I concur with Laurenz' answer that PostgreSQL would be a better option because of the update/deletes. You can store the data in standard SQL tables and use PostgreSQL JSON support to generate the necessary JSON.

    To demonstrate I took your example and created a working example in db_fiddle which you can access here https://www.db-fiddle.com/f/81Acp5wdEEaWMMxKfvbUaS/3.

    create schema test;
    create table test.tvshow ( id serial, name text );
    create table test.credit ( id serial, show_id integer, name text , role text );
    create table test.genre (id serial, name text, show_id integer );
    create table test.season ( id serial, name text, season_number text, show_id integer);
    create table test.episode ( id serial, name text, season_id integer, episode_number text, show_id integer);
    
    insert into test.tvshow (name) values ('gunsmoke');
    insert into test.credit (name, show_id) values ('credits #1', 1);
    insert into test.credit (name, show_id) values ('credits #2', 1);
    insert into test.credit (name, show_id) values ('credits #3', 1);
    
    insert into test.genre (name, show_id) values ('western', 1);
    insert into test.genre (name, show_id) values ('drama', 1);
    
    insert into test.season (name, show_id, season_number) values ('Season 1', 1, '01');
    insert into test.season (name, show_id, season_number) values ('Season 2', 1, '02');
    
    
    insert into test.episode (name, season_id, episode_number, show_id) values ('Episode#1',1, '01', 1);
    insert into test.episode (name, season_id, episode_number, show_id) values ('Episode#2',1, '01', 1);
    insert into test.episode (name, season_id, episode_number, show_id) values ('Episode#3',1, '01', 1);
    
    insert into test.episode (name, season_id, episode_number, show_id) values ('Episode#1',2, '01', 1);
    insert into test.episode (name, season_id, episode_number, show_id) values ('Episode#1',2, '02', 1);
    

    Once populated you can execute the following SQL.

      SELECT array_to_json(array_agg(row_to_json(t, true)))
    FROM (
        SELECT name
            ,(
                SELECT array_to_json(array_agg(row_to_json(c)))
                FROM (
                    SELECT id
                        ,name
                        ,ROLE
                    FROM test.credit
                    WHERE test.credit.show_id = test.tvshow.id
                    ) c
                ) AS credits
            ,
            --
            (
                SELECT array_to_json(array_agg(row_to_json(g)))
                FROM (
                    SELECT id
                        ,name
                    FROM test.genre
                    WHERE test.genre.show_id = test.tvshow.id
                    ) g
                ) AS genres
            ,
            --
            (
                SELECT array_to_json(array_agg(row_to_json(s, true)))
                FROM (
                    SELECT id
                        ,name
                        ,season_number
                        ,(
                            SELECT array_to_json(array_agg(row_to_json(e)))
                            FROM (
                                SELECT id
                                    ,name
                                    ,season_number
                                FROM test.episode
                                WHERE test.episode.show_id = test.tvshow.id
                                    AND test.episode.season_id = test.season.id
                                ) e
                            ) AS episodes
                    FROM test.season
                    WHERE test.season.show_id = test.tvshow.id
                    ) s
                ) AS seasons
        --
        FROM test.tvshow
        ) t
    

    That will return a with the JSON row for all the shows.