Search code examples
google-cloud-platformgoogle-bigquerypartitioning

Do views of tables in BigQuery benefit from partitioning/clustering optimization?


We have a few tables in BigQuery that are being updated nightly, and then we have a deduplication process doing garbage collection slowly.

To ensure that our UI is always showing the latest, we have a view setup for each table that simply does a SELECT WHERE on the newest timestamp record_id combination

We're about to setup partitioning and clustering to optimize query scope/speed and I couldn't find a clear answer in Google documentation on whether the view of that table will still have partitioned queries or it will end up querying all data.

Alternatively when we create the view, can we include the partition and cluster on in the query that builds the view?


Solution

  • If you're talking about a logical view, then yes if the base table it references is clustered/partitioned it will use those features if they're referenced from the WHERE clause. The logical view doesn't have its own managed storage, it's just effectively a SQL subquery that gets run whenever the view is referenced.

    If you're talking about a materialized view, then partitioning/clustering from the base table isn't inherited, but can be defined on the materialized view. See the DDL syntax for more details: https://cloud.google.com/bigquery/docs/reference/standard-sql/data-definition-language#create_materialized_view_statement