Search code examples
google-cloud-platformgoogle-cloud-spanner

Cloud spanner best practice INTERLEAVE questions


Let's take the tables define in docs:

CREATE TABLE Singers (
  SingerId   INT64 NOT NULL,
  FirstName  STRING(1024),
  LastName   STRING(1024),
  SingerInfo BYTES(MAX),
) PRIMARY KEY (SingerId);

CREATE TABLE Albums (
  SingerId     INT64 NOT NULL,
  AlbumId      INT64 NOT NULL,
  AlbumTitle   STRING(MAX),
) PRIMARY KEY (SingerId, AlbumId),
  INTERLEAVE IN PARENT Singers ON DELETE CASCADE;

CREATE TABLE Songs (
  SingerId     INT64 NOT NULL,
  AlbumId      INT64 NOT NULL,
  TrackId      INT64 NOT NULL,
  SongName     STRING(MAX),
) PRIMARY KEY (SingerId, AlbumId, TrackId),
  INTERLEAVE IN PARENT Albums ON DELETE CASCADE;

So we have 3 tables Singers, Albums and Songs. Table Album INTERLEAVE Singers and table Songs INTERLEAVE Singers and Albums.

My question is if we want so search all the information about a particular singer can we search in table Songs if the singer have an album but don't have any song yet ? If not what's the best practice to retrive all the data of a singer(all albums and songs(if he have any)). I thought to search in table Songs if we don't find anything in Songs(because the singer can have an album but songs are in development) search in table Album and after in Singer(because even Album can be in development) but I don't think it's the best solution.

In my case the user who make the query doesn't know if singer have any songs or albums but want to retrive all the info about singer(in one split if possible).


Solution

  • I have come to two solutions:

    1. In this case we have 3 table scan: Singers, Albums, Songs.

      select singers.singerId, albums.albumId, songs.trackId
      from singers
      left join albums ON singers.singerId = albums.singerid
      left join songs ON albums.albumid = songs.albumid

    2. Have one table like:

    Table Schema:

    CREATE TABLE Singers (
      SingerId   INT64 NOT NULL,
      AlbumId INT64,
      SongId INT64,
      .
      .(informations about Singer, Album and Song)
      .
    ) PRIMARY KEY (SingerId);
    

    So we'll have something like:

    SingerId AlbumId  SongId  SingerName AlbumName SongName
       1                        Singer 1
       1        1                          Album 1
       1        1       1                           Song 1
       1        1       2                           Song 2
       1        1       3                           Song 3
       1        1                          Album 2
       1        2       1                           Song 1
       1        2       2                           Song 2
       1        2       3                           Song 3
    

    And with 1 query we can recive all data about Singer.(We have 1 big table scan not 3, but I don't know if it's best practive because again, the server will split data between servers so we'll end with multiple selects between splits).

    What solution you think works the best and if you have something that I miss please explain.