Search code examples
sqlgoogle-bigqueryreferencesubqueryalias

How can your current SQL query refer back to a previous query without writing aliases in the query?


I am taking the Google Data Analytics course on Coursera and in the video the instructor executed the following query:

SELECT
  Date,
  Region,
  Small_Bags,
  Large_Bags,
  XLarge_Bags,
  Total_Bags,
  Small_Bags + Large_Bags + XLarge_Bags AS Total_Bags_Calc
FROM
  `class-5-355317.avocado_data.avocado_prices`

After executing this query they then opened a different editor window on BigQuery and executed the following query referring to an alias in the query above without defineing it:

SELECT
  *
FROM 
  `class-5-355317.avocado_data.avocado_prices`
WHERE
  Total_Bags != Total_Bags_Calc

When I executed this query it did not work for me and I received this error: 'Unrecognized name: Total_Bags_Calc; Did you mean Total_Bags?'

This makes sense. Within this query, the alias 'Total_Bags_Calc' hadn't been used within that query and didn't have anything to pull, so I tried a workaround:

SELECT
  Date,
  Region,
  Small_Bags,
  Large_Bags,
  XLarge_Bags,
  Total_Bags,
  (SELECT Small_Bags + Large_Bags + XLarge_Bags FROM `class-5355317.avocado_data.avocado_prices`) AS Total_Bags_Calc
FROM `class-5-355317.avocado_data.avocado_prices`
WHERE
  Total_Bags != Total_Bags_Calc

From what I understood this should work since the subquery now held the alias 'Total_Bags_Calc' but I still received the error Unrecognized name: Total_Bags_Calc; Did you mean Total_Bags?

How can I make this query work, and is there any way to have a query reference another query in the same manner that theirs did in the example?


Solution

  • You'll want to select FROM the result of your first query, so try moving that subquery into the FROM clause.

    For example,

    SELECT
      *
    FROM 
      {{ your other query goes here }}
    WHERE
      Total_Bags != Total_Bags_Calc
    

    Which would be:

    SELECT
      *
    FROM
    (SELECT
      Date,
      Region,
      Small_Bags,
      Large_Bags,
      XLarge_Bags,
      Total_Bags,
      Small_Bags + Large_Bags + XLarge_Bags AS Total_Bags_Calc
    FROM
      `class-5-355317.avocado_data.avocado_prices`
    ) as subquery
    WHERE
      Total_Bags != Total_Bags_Calc
    

    This is a really helpful technique to learn, so definitely learn it. However since you're doing something rather simple, you can actually get away with just coding that logic in your WHERE clause.

    SELECT
      Date,
      Region,
      Small_Bags,
      Large_Bags,
      XLarge_Bags,
      Total_Bags,
      Small_Bags + Large_Bags + XLarge_Bags AS Total_Bags_Calc
    FROM
      `class-5-355317.avocado_data.avocado_prices`
    WHERE (Small_Bags + Large_Bags + XLarge_Bags) <> Total_Bags