Search code examples
sqlasync-awaititeratores6-promisegoogle-cloud-spanner

How to use iterators and Async await to query a google cloud spanner table T1 while querying a second table T2 for each record in T1


I have two tables in Google cloud spanner database - Authors and Books.

const request = { schema: [ CREATE TABLE Authors ( AuthorId INT64 NOT NULL, FirstName STRING(1024), LastName STRING(1024), AuthorInfo BYTES(MAX) ) PRIMARY KEY (AuthorId), CREATE TABLE Books ( AuthorId INT64 NOT NULL, BookId INT64 NOT NULL, BookTitle STRING(MAX) ) PRIMARY KEY (AuthorId, BookId), INTERLEAVE IN PARENT Authors ON DELETE CASCADE, ], };

export async function findAuthorBooks(authorId) {
  // [START spanner_find_author_books]
  const database = instance.database(databaseId);
  const query = {
    sql: "SELECT * FROM Books As t WHERE t.AuthorId = @authorId",
    params: { authorId },
    types: { authorId: "string" },
  };

  const results = await database.run(query);
  const rows = results[0];
  const result = [];
  rows.forEach((row) => {
    const json = row.toJSON();
    result.push(json);
  });
  database.close();
  if (Array.isArray(result)) return result;
  throw new Error("err");
  // [END spanner_find_author_books]
}

Now I want to query all the Authors and for each author, I query all his books and add it like an object. The function findAuthor1() works perfectly fine.

export async function findAuthors1() {
  // [START spanner_find_authors]
  const database = instance.database(databaseId);
  const results = await database.run({ sql: "SELECT * FROM Authors" });
  const rows = results[0];
  const result = [];
  for (const row of rows) {
    const json = row.toJSON();
    const id = json.vendorId;
    json.notification = await findAuthorBooks(id);
    result.push(json);
  }
  database.close();
  if (Array.isArray(result)) return result;
  throw new Error("err");
  // [END spanner_find_authors]
}

But the problem here is, iterators/generators require regenerator-runtime, which is too heavyweight for this guide to allow them. Separately, loops should be avoided in favour of array iterations (according to Eslint). So, I decided to go with Promise.all as shown in function findAuthors2() below:

export async function findAuthors2() {
  // [START spanner_find_authors]
  const database = instance.database(databaseId);
  const results = await database.run({ sql: "SELECT * FROM Authors" });
  const rows = results[0];
  const result = [];

  await Promise.all(rows.map(async (row) => {
    const json = row.toJSON();
    const id = json.vendorId;
    json.notification = await findAuthorBooks(id);
    result.push(json);
  }));
  database.close();
  if (Array.isArray(result)) return result;
  throw new Error("err");
  // [END spanner_find_authors]
}

Unfortunately, it doesn't work. So how can I make it work or is there a different (better) way of doing it without necessarily using Promise.all? Or Is there a way to write a subquery that will select Books as Array of structs and add to the main query selecting Authors?


Solution

  • Or Is there a way to write a subquery that will select Books as Array of structs and add to the main query selecting Authors?

    Yes- using a subquery which returns an Array of Structs: (see Notes about Subqueries in Spanner SQL docs)

    eg:

    SELECT 
      a.AuthorId, 
      a.FirstName, 
      a.LastName, 
      a.AuthorInfo,
      ARRAY(SELECT AS STRUCT
              b.BookID, b.BookTitle
            FROM
              Books b
            WHERE
              a.AuthorId = b.AuthorId) as Books
    FROM
      Authors a;
    

    You will have to unpack the array of structs that is returned as the Books column in your code...

    There is also the simpler method: just join the 2 tables and get a row per author/book combination, and detect in your code when the AuthorID changes

    SELECT 
      a.AuthorId, 
      a.FirstName, 
      a.LastName, 
      a.AuthorInfo,
      b.BookID,
      b.BookTitle
    FROM
      Authors a, Books b
    WHERE
      a.AuthorID = b.BookID
    ORDER BY
      a.AuthorID;