Search code examples
google-bigqueryuat

How to Compare 2 tables for UAT Testing in BigQuery?


I am looking for some help in regards to how I go about comparing 2 tables in BigQuery for UAT Testing.

Currently I have 6 columns in both tables and they both have the same column names in the same positions.

Create Table Scripts:

CREATE TABLE `project.mydataset.table_1` (
  `ADDRESS_ID` STRING,
  `ORDER_NO` STRING,
  `START_DATE` STRING,
  `END_DATE` STRING,
  `JOB_DETAILS` STRING,
  `LOAD_DATE` STRING
);

CREATE TABLE `project.mydataset.table_2` (
      `ADDRESS_ID` STRING,
      `ORDER_NO` STRING,
      `START_DATE` STRING,
      `END_DATE` STRING,
      `JOB_DETAILS` STRING,
      `LOAD_DATE` STRING
    );

The following is an example of what the table looks like:

ADDRESS_ID | ORDER_NO | START_DATE | END_DATE | JOB_DETAILS | LOAD_DATE
000045 | AE12345 | 2019-04-03 00:00:00 | 2019-04-21 17:14:11 | With Engineer | 2019-04-02 00:00:00

Now what I would like to do, is to go through various checks that will give me stats for the differences between the 2 tables.

One of the first queries I have written was to check the volume I have in both tables for the 01/04/22 for example and this gave me the following volumes:

TABLE_1 = 4317
TABLE_2 = 4431

The next one was to check the duplicates in the 'ADDRESS_ID' column and this gave me the following:

 TABLE_1 = 67
 TABLE_2 = 61

But rather than writing individual queries, is there something that I can write that will give me a load of stats or differences i.e. that will check the 2 tables row by row with a join on the address key or something and then compare and then tell me the difference across the whole table?

Apologies I don't have anymore sample data, as the data is from my workplace.


Solution

  • Take a look at Google's Data Validation Tool. It is pretty decent at running tests between two tables of varying sources.