
Code fails to update a table on BQ using DML, but succeeds for insertion and deletion with RPC


I wrote some code that uses a service account to write to BigQuery on Google Cloud.

Strangely, only the "update" operation using DML fails; the insertion and deletion RPC calls succeed.

    def create_table(self, table_id, schema):
        table_full_name = self.get_table_full_name(table_id)
        table = self.get_table(table_full_name)
        if table is not None:
            return  # Table already exists; nothing to do.
        table = bigquery.Table(table_full_name, schema=schema)
        table = self.client.create_table(table)  # Make an API request.
        print("Created table {}.{}.{}".format(table.project, table.dataset_id, table.table_id))


#Works!
    def upload_rows_to_bq(self, table_id, rows_to_insert):
        table_full_name = self.get_table_full_name(table_id)
        for ads_chunk in split(rows_to_insert, _BQ_CHUNK_SIZE):
            errors = self.client.insert_rows_json(table_full_name, ads_chunk,
                row_ids=[None] * len(ads_chunk))  # Make an API request.
            if not errors:
                print("New rows have been added.")
            else:
                print("Encountered errors while inserting rows: {}".format(errors))

#Permissions Failure
    def update_bq_ads_status_removed(self, table_id, update_ads):
        affected_rows = 0
        table_full_name = self.get_table_full_name(table_id)
        for update_ads_chunk in split(update_ads, _BQ_CHUNK_SIZE):
            ad_ids = [item["ad_id"] for item in update_ads_chunk]
            affected_rows += self.update_bq_ads_status(f"""
                            UPDATE {table_full_name}
                            SET status = 'Removed'
                            WHERE ad_id IN {tuple(ad_ids)}
                            """)
        return affected_rows
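As a side note, separate from the permissions error: interpolating `tuple(ad_ids)` into the SQL breaks when a chunk contains a single id, because Python renders a one-element tuple as `('x',)` and BigQuery rejects the trailing comma. A minimal sketch of a safer formatter (the function name is hypothetical; BigQuery query parameters would be the more robust fix):

```python
def format_in_list(ids):
    """Build a SQL IN list from string ids, avoiding the one-element
    trailing-comma form that tuple() would produce."""
    quoted = ", ".join("'{}'".format(i.replace("'", "\\'")) for i in ids)
    return "({})".format(quoted)

# tuple(['ad1']) would render as ('ad1',); this renders cleanly:
print(format_in_list(["ad1"]))          # ('ad1')
print(format_in_list(["ad1", "ad2"]))   # ('ad1', 'ad2')
```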

I get this error for update only:

User does not have bigquery.jobs.create permission in project ABC.


Solution

  • I will elaborate on my comment.

    In GCP there are 3 types of IAM roles:

    • Basic roles include the Owner, Editor, and Viewer roles.
    • Predefined roles provide granular access for a specific service and are managed by Google Cloud. Predefined roles are meant to support common use cases and access control patterns.
    • Custom roles provide granular access according to a user-specified list of permissions.

    What's the difference between predefined and custom roles? Predefined roles cannot be modified; if you start from a predefined role and add or remove permissions, the result is a custom role.

    Predefined roles for BigQuery, with their permission lists, can be found in the BigQuery IAM documentation.

    The mentioned error:

    User does not have bigquery.jobs.create permission in project ABC.

    means that the IAM role lacks the specific BigQuery permission bigquery.jobs.create. A DML UPDATE runs as a query job, so it requires this permission; the streaming insert (insert_rows_json) and table-level RPC calls do not create jobs, which is why only the update fails.

    The bigquery.jobs.create permission can be found in predefined roles such as:

    • BigQuery Job User - (roles/bigquery.jobUser)
    • BigQuery User - (roles/bigquery.user)

    It can also be added to a different predefined role, but doing so turns it into a custom role.
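    For example, assuming the script authenticates as a service account (the project and account names below are placeholders), the missing permission can be granted like this:

    ```shell
    # Grant the BigQuery Job User role (which carries bigquery.jobs.create)
    # to the service account; project and account names are placeholders.
    gcloud projects add-iam-policy-binding my-project \
        --member="serviceAccount:my-sa@my-project.iam.gserviceaccount.com" \
        --role="roles/bigquery.jobUser"
    ```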

    In addition, the Testing Permissions guide explains how to test IAM permissions.