Search code examples
postgresqlgoogle-bigquerydynamic-sql

BigQuery Script to replicate google cloud postgres tables to BigQuery with federated queries


I have a google cloud postgres instance and I'm trying to replicate all the tables from a schema into separate google bigquery tables with a script running federated queries that will be scheduled to run at some interval. I'm using data from information_schema since I would like the script to be able to pick newly added tables as well.

Lets say the analytics schema has 4 tables.

analytics.table1
analytics.table2
analytics.table3
analytics.table4

My example code below generates a list of all tables in the analytics schema and sets it to variable name "table_name".

DECLARE table_name ARRAY<STRING>;

SET table_name=(
SELECT * FROM EXTERNAL_QUERY("<your_connection_id>",
    "select table_schema||'.'||table_name as table_name from information_schema.tables 
        where table_schema='analytics';");
);

My intent for this script is to create four separate bigquery tables with the queries below

CREATE TABLE analytics.table1 AS
SELECT * FROM EXTERNAL_QUERY("<your_connection_id>", "SELECT * FROM analytics.table1");

CREATE TABLE analytics.table2 AS
SELECT * FROM EXTERNAL_QUERY("<your_connection_id>", "SELECT * FROM analytics.table2");

CREATE TABLE analytics.table3 AS
SELECT * FROM EXTERNAL_QUERY("<your_connection_id>", "SELECT * FROM analytics.table3");

CREATE TABLE analytics.table4 AS
SELECT * FROM EXTERNAL_QUERY("<your_connection_id>", "SELECT * FROM analytics.table4");

What is the best way to loop through my list of tables "table_name" to create these tables with separate sql statements?


Solution

  • You can try this solution:

    DECLARE TABLE_LIST ARRAY<STRING>;
    DECLARE DSQL STRING;
    DECLARE i INT64 DEFAULT 1;
    DECLARE cnt INT64 DEFAULT 0;
    SET TABLE_LIST = ARRAY(SELECT * FROM EXTERNAL_QUERY("connection-id", "SELECT table_name FROM INFORMATION_SCHEMA.TABLES WHERE TABLE_SCHEMA = 'analytics';"));
    SET cnt = ARRAY_LENGTH(TABLE_LIST);
    WHILE i <= cnt 
    DO
      SET DSQL = '"SELECT * FROM analytics.' || TABLE_LIST[ORDINAL(i)] || '"';
      EXECUTE IMMEDIATE 'CREATE TABLE analytics.' || TABLE_LIST[ORDINAL(i)] || ' AS SELECT * FROM EXTERNAL_QUERY("connection-id",' || DSQL || ');';
      SET i = i + 1;
    END WHILE;