I'm trying to connect to a remote instance of Databricks and write a csv file to a specific folder of the DBFS. I can find bits and pieces here and there but I'm not seeing how to get this done. How do I add the file to DBFS on a remote Databricks instance from a Java program running on my local machine?
I'm currently using a community instance I created from here: https://databricks.com/try-databricks
This is the url for my instance (I'm guessing the "o=7823909094774610" is identifying my instance).
https://community.cloud.databricks.com/?o=7823909094774610
Here are some of the resources I've looked at while trying to resolve this, but I still can't get off the ground:
The Databricks Connect documentation: This talks about connecting but not specifically from Java. It gives an example of "connecting Eclipse" to Databricks, which seems to be how to get the jar dependency for this (side question: is there a Maven version of this?). https://docs.databricks.com/dev-tools/databricks-connect.html#run-examples-from-your-ide
Some Java sample code: Doesn't seem to have an example of connecting to a remote Databricks instance https://www.programcreek.com/java-api-examples/index.php?api=org.apache.spark.sql.SparkSession
Databricks File System (DBFS) Documentation: Gives a good overview of file functions but doesn't seem to talk specifically about how to connect from a remote Java application and write the file to the Databricks instance from the Java application https://docs.databricks.com/data/databricks-file-system.html
FileStore documentation: Gives a good overview of file store but again doesn't seem to talk specifically about how to do this from a remote Java application https://docs.databricks.com/data/filestore.html
You could take a look at the DBFS REST API, and consider using that in your Java application.
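As a rough illustration of the REST route, here is a minimal sketch in Java 11+ that posts a small file to the DBFS `put` endpoint (`/api/2.0/dbfs/put`) with the file contents base64-encoded in a JSON body. The class name `DbfsUpload`, its helper methods, and the `DATABRICKS_HOST` / `DATABRICKS_TOKEN` environment variables are names chosen for this example, not part of any Databricks library. It also assumes your workspace lets you generate a personal access token for the `Authorization: Bearer` header, which may not be the case on every edition. Sending the contents inline like this is limited to small files (about 1 MB); larger uploads would need the streaming `create` / `add-block` / `close` calls instead.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Base64;

public class DbfsUpload {

    // Minimal JSON string escaping for the path value (backslash and double quote only).
    static String escapeJson(String s) {
        return s.replace("\\", "\\\\").replace("\"", "\\\"");
    }

    // Builds the JSON body for POST /api/2.0/dbfs/put; contents must be base64-encoded.
    static String buildPutBody(String dbfsPath, byte[] data, boolean overwrite) {
        String contents = Base64.getEncoder().encodeToString(data);
        return "{\"path\":\"" + escapeJson(dbfsPath) + "\","
             + "\"contents\":\"" + contents + "\","
             + "\"overwrite\":" + overwrite + "}";
    }

    // Builds the HTTP request; host is the workspace URL without a trailing slash (assumption).
    static HttpRequest buildPutRequest(String host, String token, String body) {
        return HttpRequest.newBuilder()
                .uri(URI.create(host + "/api/2.0/dbfs/put"))
                .header("Authorization", "Bearer " + token)
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(body))
                .build();
    }

    public static void main(String[] args) throws Exception {
        // Hypothetical environment variable names used only by this sketch.
        String host = System.getenv("DATABRICKS_HOST");
        String token = System.getenv("DATABRICKS_TOKEN");
        if (host == null || token == null || args.length < 2) {
            System.out.println("Set DATABRICKS_HOST and DATABRICKS_TOKEN, then pass <local csv> <dbfs path>");
            return;
        }
        byte[] data = Files.readAllBytes(Path.of(args[0]));
        HttpRequest request = buildPutRequest(host, token, buildPutBody(args[1], data, true));
        HttpResponse<String> response =
                HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString());
        // Print the status code and response body so any API error is visible.
        System.out.println(response.statusCode() + " " + response.body());
    }
}
```

You would run it with something like `java DbfsUpload.java data.csv /FileStore/my-folder/data.csv`, where both paths are placeholders. A hand-built JSON string keeps the sketch dependency-free; in a real application a JSON library would be the safer choice.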
If a Java solution is not required, then you could also take a look at the databricks-cli. After installing it with pip (pip install databricks-cli), you simply have to run:
databricks configure
databricks fs cp <source> dbfs:/<target>