Search code examples
google-bigqueryfact

is there concept like fact and dimension in bigquery


As we are planning to migrate the data from Teradata to google cloud(Bigquery). In Teradata we have key concepts like primary and foreign with help of this keys we are able to define relation between dimension and fact.

Say for example I have 3 dimension tables and one fact table as shown below.

D1 D2 D3

F1

with the help of keys or indexes in Teradata we can able to fetch the data from fact table.

When coming to Bigquery we do not have any concept like keys or indexes then how we are going to define relation between the dimension and fact

Note: If there are no primary keys or index concept how we are going to eliminate the duplicates


Solution

  • Primary keys, facts, and dimensions are concepts that previous generation data warehouses needed to rely on for performance.

    For example, in Teradata a primary index key is needed to distribute the data between nodes. This distribution will be key to enable fast performing queries later - but in the BigQuery world this type of pre-planned distribution among nodes is not needed.

    Same thing with facts, dimensions, star schemas, OLAP cubes... the main reason for their existence is to enable faster queries later - by restricting what can be queried and among which dimensions. You won't need to worry about this with BigQuery.

    Instead of dividing into its normal form, it makes sense to have a flat table with all the dimensions incorporated in BigQuery. Arbitrary JOINs will be fast too - but flat tables and nested data are easy to handle here.

    Now that you are not restricted by older tech needs - removing duplicates becomes a different type of operation.