
Builtin provider com.databricks.spark.csv not found in SnappyData v0.5.2


I am using SnappyData v0.5.2 to get the fix for SNAP-961.

However, since moving from the preview release v0.5 to v0.5.2, I can no longer load data from a CSV file.

The error is:

ERROR 38000: (SQLState=38000 Severity=-1) (Server=ip-10-0-18-66.us-west-2.compute.internal[1528],Thread[DRDAConnThread_28,5,gemfirexd.daemons]) The exception 'Failed to find a builtin provider com.databricks.spark.csv;' was thrown while evaluating an expression.

Here is what I am executing:

-- creates in-memory table from csv
CREATE TABLE STAGING_ROAD (road_id string, name string) USING com.databricks.spark.csv OPTIONS(path 'roads.csv', header 'true', inferSchema 'false');

Solution

  • The SQL and Spark APIs have been aligned, so now only builtin data sources (column, row, and the streaming/AQP ones) can use "CREATE TABLE"; all other sources have to use "CREATE EXTERNAL TABLE". The same applied to SnappyContext, where the createTable API could be used only for builtin sources and createExternalTable was required for the rest. The following should work with both older and newer releases:

    CREATE EXTERNAL TABLE STAGING_ROAD (road_id string, name string) USING com.databricks.spark.csv OPTIONS(path 'roads.csv', header 'true', inferSchema 'false')
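If the DDL is generated from a script, the rule above can be captured in a small helper. This is only a sketch in plain Python, not part of any SnappyData API: the function name `create_table_ddl` and the `BUILTIN_PROVIDERS` set are hypothetical, and the set lists only the "column" and "row" providers because the answer does not spell out the streaming/AQP provider names.

```python
# Providers the answer names as builtin. The streaming/AQP provider names are
# not given in the answer, so they are left out here (assumption: add as needed).
BUILTIN_PROVIDERS = {"column", "row"}


def create_table_ddl(table, columns, provider, options=None):
    """Build a CREATE TABLE or CREATE EXTERNAL TABLE statement for a provider.

    Hypothetical helper: it only formats a string and does no quoting or
    escaping, so it should be fed trusted identifiers and option values.
    """
    if provider.lower() in BUILTIN_PROVIDERS:
        keyword = "CREATE TABLE"
    else:
        # Any non-builtin data source has to go through CREATE EXTERNAL TABLE.
        keyword = "CREATE EXTERNAL TABLE"
    cols = ", ".join(f"{name} {sql_type}" for name, sql_type in columns)
    ddl = f"{keyword} {table} ({cols}) USING {provider}"
    if options:
        opts = ", ".join(f"{key} '{value}'" for key, value in options.items())
        ddl += f" OPTIONS({opts})"
    return ddl


staging_ddl = create_table_ddl(
    "STAGING_ROAD",
    [("road_id", "string"), ("name", "string")],
    "com.databricks.spark.csv",
    {"path": "roads.csv", "header": "true", "inferSchema": "false"},
)
print(staging_ddl)
```

For the CSV provider this produces the same CREATE EXTERNAL TABLE statement as in the answer, while a call with provider "column" or "row" produces a plain CREATE TABLE statement.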