I uploaded data.csv to Azure Blob Storage using Microsoft Azure Storage Explorer, then copied the URL and created a table in Databricks:
%sql
DROP TABLE IF EXISTS data;
CREATE TABLE IF NOT EXISTS data
USING CSV
OPTIONS (header "true", inferSchema "true")
LOCATION "url/data.csv";
Now I want to use sparklyr to manipulate "data".
How should I convert the above table to a sparklyr DataFrame so that I can use the full potential of sparklyr?
First, you must initialise your sparklyr connection as follows:
library(sparklyr)
sc <- spark_connect(method = "databricks")
You can then read directly from your SQL tables using:
sdf_sql(sc, "SELECT * FROM ...")
and then perform all of the usual sparklyr/dplyr work as normal.
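Putting the steps together, here is a minimal sketch of the workflow. It assumes the table is registered as `data` (as in the question) and uses a hypothetical column name `some_column` purely for illustration:

```r
library(sparklyr)
library(dplyr)

# Connect to the Spark cluster backing the Databricks notebook
sc <- spark_connect(method = "databricks")

# Pull the SQL table into a sparklyr DataFrame (a lazy reference to
# the Spark table, not a local copy)
data_sdf <- sdf_sql(sc, "SELECT * FROM data")

# Equivalently, reference the registered table directly via dplyr:
# data_sdf <- tbl(sc, "data")

# dplyr verbs are translated to Spark SQL and run on the cluster;
# collect() brings the (small) result back into local R memory
data_sdf %>%
  filter(!is.na(some_column)) %>%  # hypothetical column name
  count() %>%
  collect()
```

Note that the result of `sdf_sql()` stays on the cluster until you call `collect()`, so transformations scale to data that would not fit in local R memory.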
Note that Databricks clusters do not come pre-loaded with sparklyr, as Databricks wants to push you towards the SparkR API for interacting with your data instead. If you wish to use the sparklyr API, you must install and load sparklyr each time you start the cluster.
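In practice that means something like the following at the top of your notebook (or in a cluster init script), run once per cluster start:

```r
# Install sparklyr into the cluster's R library (once per cluster start)
if (!requireNamespace("sparklyr", quietly = TRUE)) {
  install.packages("sparklyr")
}
library(sparklyr)
```

Alternatively, you can add sparklyr to the cluster's Libraries configuration in the Databricks UI so it is installed automatically on startup.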