In BigQuery, I have several tables with the same name prefix. The suffix of each table is a month, written in format yyyy_mm like 2023_09.
These tables are created by Google Datastream, copied over from PostgreSQL partitioned table.
For the previous months I have used the following query to create a view to combine these tables
SELECT
*,
REPLACE(_TABLE_SUFFIX, '_', '-') AS month,
FROM
`my_project_id.my_dataset.my_table_*`
WHERE
REGEXP_CONTAINS(_TABLE_SUFFIX, '^[0-9]{4}_[0-9]{2}$')
It worked well, but recently I encountered a new error in BigQuery which stops me from querying the view.
Tables with unapplied upsert data cannot be queried through prefix. First table with upsert data is my_project_id:my_dataset.my_table_2022_09.
I don't know how to fix this or what workaround can be applied, except writing all tables explicitly in the query.
I've raised a Google Support Case and received an answer that this is a known issue, but the current error message is not very accurate. There will be a support for this feature.
The suggested workaround is to use the ‘UNION’, the easiest way to query all tables.
I created the view and added a fixed value column to each table. Filtering by this column reduces the scanned bytes.
CREATE VIEW some.view as (
SELECT *, '2023-01' as month,
FROM `my_project_id.my_dataset.my_table_2023_01`
UNION ALL
SELECT *, '2023-02' as month,
FROM `my_project_id.my_dataset.my_table_2023_02`
UNION ALL
SELECT *, '2023-03' as month,
FROM `my_project_id.my_dataset.my_table_2023_03`
);
SELECT * FROM some.view WHERE month = '2023-03'