
Native Delta Lake support on GCP Dataproc


According to the documentation at https://cloud.google.com/blog/topics/developers-practitioners/how-build-open-cloud-datalake-delta-lake-presto-dataproc-metastore, Delta Lake is natively supported on image versions > 1.5, and the jars should be available under /usr/lib/delta/jars.

Question:

I am using image version '2.1.18-debian11', but I do not see delta-core.jar under /usr/lib/delta/jars. I also do not see any option to select Delta as a component (the way Jupyter Notebook can be selected) when creating a new Dataproc cluster. Is Delta natively supported on Dataproc?


Solution

  • Delta Lake is not preinstalled on Dataproc 2.0+ images. Users can bring their own Delta Lake jars as job dependencies.
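
One way to bring the jars as a job dependency is to let Spark resolve them from Maven when the session starts. The snippet below is a minimal sketch, not an official Dataproc recipe. The helper names (delta_spark_conf, build_session) are hypothetical, and the versions are assumptions: Delta Lake 2.3.0 with Scala 2.12 is assumed to match the Spark 3.3 runtime on a 2.1 image, so check the Delta compatibility table for your image before relying on it. It also assumes the cluster can reach Maven Central.

```python
def delta_spark_conf(delta_version="2.3.0", scala_version="2.12"):
    """Return the Spark properties that pull in Delta Lake as a job dependency.

    The default versions are assumptions; pick the Delta release that matches
    the Spark and Scala versions of your Dataproc image.
    """
    return {
        # Maven coordinate that Spark downloads at startup.
        "spark.jars.packages": f"io.delta:delta-core_{scala_version}:{delta_version}",
        # Enables Delta's SQL extensions (MERGE, VACUUM, and so on).
        "spark.sql.extensions": "io.delta.sql.DeltaSparkSessionExtension",
        # Routes the default catalog through Delta's catalog implementation.
        "spark.sql.catalog.spark_catalog": "org.apache.spark.sql.delta.catalog.DeltaCatalog",
    }


def build_session(app_name="delta-on-dataproc"):
    """Create a SparkSession configured with the properties above."""
    # Imported lazily so the config helper can be used without pyspark installed.
    from pyspark.sql import SparkSession

    builder = SparkSession.builder.appName(app_name)
    for key, value in delta_spark_conf().items():
        builder = builder.config(key, value)
    return builder.getOrCreate()
```

Calling build_session() in a PySpark job should then allow reads and writes with format("delta"). The same three properties can instead be passed at submit time through the --properties flag of gcloud dataproc jobs submit, or the jar can be staged in a bucket and supplied with --jars if the cluster has no internet access.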