Tags: google-cloud-platform, google-cloud-billing, bigquery-udf

BigQuery UDF called from a different project - who gets the bill?


Scenario:

  • Project A hosts the BigQuery UDFs.
  • Project B is the compute project. A job running in Project B calls the UDFs hosted in Project A.

As I understand it, a UDF is executed once per value, so if column C1 has 100 values, the UDF is called 100 times. That should add up to a fair amount of compute in BigQuery.

Questions

  1. Where exactly does the BigQuery UDF compute occur: in Project A or in Project B?
  2. Which project gets billed for it?

Ideally, we want Project B to bear this cost: the UDFs will be accessed by multiple compute projects, and Project A should not pay for them.


Solution

  • The compute runs in Project B, the project that runs the query.

    The cost is more nuanced, but in either case Project A never pays for it.

    • With on-demand pricing, you pay only for the volume of data scanned, regardless of how many slots the query uses (there is a quota of 2,000 slots per project). Because slots are not billed, only scanned bytes, the UDF calls add no cost: they are compute, and compute is not charged.
    • With capacity pricing, meaning you have a reservation with slots, a query that calls UDFs uses more slots, and you pay for those slots.
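To make the difference between the two pricing models concrete, here is a minimal sketch of the arithmetic. The function names are hypothetical, and the prices used in the example are illustrative placeholders, not quoted rates; check the current BigQuery pricing page for real values. The capacity figure is only a rough attribution, since with a reservation you pay for the slots you reserve or autoscale rather than per query.

```python
TIB = 2 ** 40  # bytes in one tebibyte


def on_demand_cost(bytes_billed: int, usd_per_tib: float) -> float:
    """On-demand: cost depends only on bytes scanned, not on slot usage."""
    return bytes_billed / TIB * usd_per_tib


def capacity_cost(slot_millis: int, usd_per_slot_hour: float) -> float:
    """Capacity: rough attribution of slot time consumed by a query."""
    slot_hours = slot_millis / 3_600_000  # milliseconds in one hour
    return slot_hours * usd_per_slot_hour


# Assumed placeholder prices, for illustration only.
USD_PER_TIB = 6.25
USD_PER_SLOT_HOUR = 0.04

# Same 1 TiB scan, with and without a heavy UDF (assumed slot usage).
plain_slot_ms = 3_600_000    # 1 slot-hour without the UDF
udf_slot_ms = 36_000_000     # 10 slot-hours with the UDF

# On-demand: identical cost either way, because bytes scanned are identical.
print(on_demand_cost(TIB, USD_PER_TIB))

# Capacity: the UDF-heavy query accounts for more slot time.
print(capacity_cost(plain_slot_ms, USD_PER_SLOT_HOUR))
print(capacity_cost(udf_slot_ms, USD_PER_SLOT_HOUR))
```

In practice, the bytes billed and slot milliseconds of a finished job can be read from the job statistics in Project B, which is where the query ran.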

    This answer applies only to regular UDFs. Remote UDFs are different. A remote UDF is backed by a Cloud Run function (or a Cloud Run service) that processes the BigQuery rows: the BigQuery job sends chunks of rows to the remote UDF for processing.

    Because the remote UDF uses CPU and memory when it runs, the project that hosts the remote UDF pays for those resources.
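As a rough illustration of the "chunks of rows" a remote UDF receives, here is a minimal sketch of the request handling. BigQuery sends a JSON body whose `calls` field holds one argument list per row and expects a `replies` list of the same length in return. The names `handle_remote_udf` and `normalize_value` are hypothetical, and the HTTP wiring (Cloud Run function entry point, JSON parsing) is left out.

```python
from typing import Any, Optional, Tuple


def normalize_value(value: Optional[str]) -> Optional[str]:
    """Hypothetical per-row logic: upper-case a string, pass NULL through."""
    return None if value is None else value.upper()


def handle_remote_udf(request_body: dict) -> Tuple[dict, int]:
    """Process one batch of rows and return (response body, HTTP status)."""
    try:
        # Each entry in "calls" is the argument list for one row.
        replies: list = [normalize_value(*args) for args in request_body["calls"]]
    except Exception as exc:
        # A non-200 status with "errorMessage" reports the failure to BigQuery.
        return {"errorMessage": str(exc)}, 400
    # One reply per call, in the same order.
    return {"replies": replies}, 200


# Example batch of three rows, each with a single argument.
body, status = handle_remote_udf({"calls": [["a"], [None], ["Bq"]]})
print(status, body)
```

Every batch handled this way consumes CPU and memory in the hosting project, which is why that project is billed for the function's runtime.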