Search code examples
google-cloud-platformgoogle-bigquerygoogle-cloud-datalab

How to query BigQuery view through Datalab?


I have a SQL View Saved in my BigQuery. I wanted to run SQL queries on this View. WE can run SQL Query on Tables as follow :

%%bq query
SELECT * 
FROM tablename
WHERE condition

But I am getting error while running the same for View:

%%bq query
SELECT * FROM viewname

Error:

invalidQuery: Cannot reference a legacy SQL view in a standard SQL query.

OR

import google.datalab.bigquery as bq
c = bq.View.query(sql='SELECT * From viewname').execute().result().to_dataframe()    

Error:

NameErrorTraceback (most recent call last) in () 1 import google.datalab.bigquery as bq ----> 2 c = View.query(sql='SELECT * From viewname').execute().result().to_dataframe()
TypeError: 'property' object is not callable


Solution

  • As you have already identified there are two ways of working with BigQuery in Datalab:

    • Using %%bq, as presented in this sample notebook.
    • Using Datalab's BigQuery API libraries (sample notebook). Currently, the supported and updated version is google.datalab.bigquery, however the legacy version datalab.bigquery is still available for backwards compatibility reasons.

    As per the error message that you are getting ( Cannot reference a legacy SQL view in a standard SQL query ), I understand that the view you are querying was created with a Legacy SQL query. As explained in this post, Datalab is moving to Standard SQL, and therefore its support for Legacy SQL is being discontinued. That is the reason why both %%bq and google.datalab.bigquery do not support Legacy SQL, and only work with Standard SQL.

    If you really need the view as it is right now (i.e. created with Legacy SQL), as a temporary workaround, you can use the old library instead (which is still available), datalab.bigquery as below:

    import datalab.bigquery as bq
    bq.Query("SELECT * FROM dataset.legacy_sql_view;").results()
    

    However, I would strongly recommend that you start migrating to Standard SQL, given that it is the preferred query language for BigQuery, and it has multiple advantages, among which you can count the official support in Datalab.