Is there a Scala equivalent of the R/Python put_file() methods for taking an object from a notebook in DSX and saving it as a data asset in the project? If so, is there any documentation? I'm looking for something like what is outlined in this article:
https://datascience.ibm.com/blog/working-with-object-storage-in-data-science-experience-python-edition/
I have already written the CSV file I want within the notebook; I just need to save it to the project!
Try the following steps and code snippets.
Step 1: First, generate the credentials. You can generate them (for any file already uploaded from your browser) by clicking 'Insert to code' -> 'Insert SparkSession DataFrame' under the Files tab of the 'Find and Add Data' pane in DSX.
// Hadoop configuration for the Swift driver, pointing at the project's
// Object Storage instance. The method name suffix is auto-generated by
// DSX's 'Insert to code' feature.
def setHadoopConfig2db1c1ff193345c28eaffb250b92d92b(name: String) = {
    val prefix = "fs.swift.service." + name
    sc.hadoopConfiguration.set(prefix + ".auth.url", "https://identity.open.softlayer.com/v3/auth/tokens")
    sc.hadoopConfiguration.set(prefix + ".auth.endpoint.prefix", "endpoints")
    sc.hadoopConfiguration.set(prefix + ".tenant", "<tenant id>")
    sc.hadoopConfiguration.set(prefix + ".username", "<userid>")
    sc.hadoopConfiguration.set(prefix + ".password", "<password>")
    sc.hadoopConfiguration.setInt(prefix + ".http.port", 8080)
    sc.hadoopConfiguration.set(prefix + ".region", "dallas")
    sc.hadoopConfiguration.setBoolean(prefix + ".public", false)
}
val name = "keystone"
setHadoopConfig2db1c1ff193345c28eaffb250b92d92b(name)
val data_frame1 = spark.read.option("header", "true").csv("swift://<YourDSXProjectName>.keystone/<your file name>.csv")
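To quickly confirm the read worked, you can preview a few rows (this check is not part of the generated snippet):

data_frame1.show(5)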
Step 2: Write whatever code creates data_frame2 from data_frame1 after, say, some transformation; a sketch follows.
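For example, a minimal sketch of such a transformation; the 'age' column is hypothetical and stands in for whatever columns your CSV actually has:

import org.apache.spark.sql.functions.col

// Hypothetical transformation: keep only rows where an assumed "age" column exceeds 30.
// Replace this with whatever transformation your data actually needs.
val data_frame2 = data_frame1.filter(col("age") > 30)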
Step 3: Use the same container and project name when saving the data of data_frame2 to a file in Object Storage.
data_frame2.write.option("header", "true").csv("swift://<same DSX project name as before>.keystone/<name of the file you want to write>.csv")
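Note that Spark writes the target path as a directory of part files rather than a single CSV file. If you need one file, you can coalesce to a single partition first (at the cost of parallel writes); the path is the same placeholder as above:

// Coalescing to one partition makes Spark write a single part file
// inside the target directory (slower for large data, but yields one CSV).
data_frame2.coalesce(1).write.option("header", "true").csv("swift://<same DSX project name as before>.keystone/<name of the file you want to write>.csv")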
Please note that once you have generated the credentials in Step 1, you can use them to save any DataFrame from your current notebook, without even reading data from a file first.
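For instance, a minimal sketch (the column names and values are made up) that builds a DataFrame directly in the notebook and saves it with the same configuration:

// Build a small DataFrame in the notebook; no input file is needed.
import spark.implicits._
val scores = Seq(("alice", 90), ("bob", 85)).toDF("name", "score")

// Save it to the Object Storage container configured in Step 1.
scores.write.option("header", "true").csv("swift://<same DSX project name as before>.keystone/scores.csv")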