python oauth-2.0 google-bigquery jupyter-notebook google-oauth

Using OAuth2 user account authentication with the Python Google Cloud API from a Jupyter notebook


I am trying to access BigQuery from Python code in a Jupyter notebook running on my local machine, so I installed the Google Cloud API packages on my laptop.

I need to pass OAuth2 authentication. Unfortunately, I only have a user account for our BigQuery project. I have neither a service account nor application credentials, and I do not have permission to create them. I am only allowed to work with my user account.

When I call bigquery.Client(), it appears to look for application credentials through the environment variable GOOGLE_APPLICATION_CREDENTIALS. But that variable, it seems, is meant for the application credentials that I do not have.

I cannot find any other way to connect using user account authentication, which I find extremely odd because:

  1. The Google API for R simply works with user authentication. Equivalent code in R (which has a different API) just works!
  2. I run the code from the DataSpell IDE. In the IDE I have created a database connection to BigQuery (using my user authentication). There I can open a console for the database and run SQL queries with no problem. I have attached the BigQuery session to my Python notebook, and I can see the notebook attached to the BigQuery session in the Services pane. But I am still missing something that would let me use a valid, running connection from the Python code (I do not know how to get a Python object representing a valid, connected client).

I have been reading Google's manuals and looking for code examples for hours. Alas, I cannot find any description of how to connect a client using a user account from my notebook.

Please, can someone help?


Solution

  • You can use the pydata-google-auth library to authenticate with a user account. Its get_user_credentials function loads credentials from a cache on disk, or initiates an OAuth 2.0 flow if no cached credentials are found. Note that this is not the recommended way to authenticate.

    import pandas_gbq
    import pydata_google_auth
    
    SCOPES = [
        'https://www.googleapis.com/auth/cloud-platform',
        'https://www.googleapis.com/auth/drive',
    ]
    
    credentials = pydata_google_auth.get_user_credentials(
        SCOPES,
        # Set auth_local_webserver to True to have a slightly more convenient
        # authorization flow. Note, this doesn't work if you're running from a
        # notebook on a remote server, such as over SSH or with Google Colab.
        auth_local_webserver=True,
    )
    
    df = pandas_gbq.read_gbq(
        "SELECT my_col FROM `my_dataset.my_table`",
        project_id='YOUR-PROJECT-ID',
        credentials=credentials,
    )
    

    The recommended way to authenticate is to ask your GCP administrator to create a service account key for you, following these instructions.

    You can then set up authentication with that key file:

    from google.oauth2 import service_account
    
    credentials = service_account.Credentials.from_service_account_file(
        '/path/to/key.json')
    

    You can see more of the documentation here.
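
Both snippets above stop at a credentials object, while the question asks for a connected client. As a minimal sketch, assuming the google-cloud-bigquery package is installed, either kind of credentials can be passed straight to bigquery.Client. The helper name make_bigquery_client is hypothetical; the project ID and table name are the placeholders from the first snippet.

```python
from typing import Any


def make_bigquery_client(credentials: Any, project_id: str):
    """Return a bigquery.Client that authenticates with the given credentials.

    `credentials` can be the object returned by
    pydata_google_auth.get_user_credentials(...) or by
    service_account.Credentials.from_service_account_file(...).
    """
    # Imported inside the function so the helper can be defined even before
    # google-cloud-bigquery is installed (pip install google-cloud-bigquery).
    from google.cloud import bigquery

    # Credentials and project are passed explicitly, so the client does not
    # need to fall back to GOOGLE_APPLICATION_CREDENTIALS.
    return bigquery.Client(project=project_id, credentials=credentials)


# Usage in the notebook (placeholder project ID and table name):
# client = make_bigquery_client(credentials, "YOUR-PROJECT-ID")
# rows = client.query("SELECT my_col FROM `my_dataset.my_table`").result()
# for row in rows:
#     print(row["my_col"])
```

With user credentials from pydata_google_auth, the first call may open a browser window for consent; later calls should reuse the cached credentials.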