How to create a table from a CSV?

SnappyData v.0.5

I want to do something similar to loading parquet files as found in the QuickStart load scripts.

CREATE TABLE STAGING_AIRLINEREF USING parquet OPTIONS(path '../../quickstart/data/airportcodeParquetData');

However, I have CSV files instead of Parquet files. I cannot find "USING parquet" or a CSV equivalent in any RowStore documentation, so I guessed at the syntax below, and it fails.

CREATE TABLE STAGING_ROADS USING csv OPTIONS(path 'roads.csv');

How can I create a table directly from a CSV file where the header row is the column names and the rest are loaded as data rows?

EDIT

OK. Following the Spark-CSV syntax, I load the file below, but I end up with zero rows and no table.

"roadId","name"
"1","Road 1"
"2","Road 2"
"3","Road 3"
"4","Road 4"
"5","Road 5"
"6","Road 6"
"7","Road 7"
"8","Road 8"
"9","Road 9"
"10","Road 10"


snappy> run '/home/ubuntu/data/example/load_roads.sql';
snappy> SET SCHEMA A;
0 rows inserted/updated/deleted
snappy> DROP TABLE IF EXISTS STAGING_ROADS;
0 rows inserted/updated/deleted
snappy> CREATE TABLE STAGING_ROADS
(road_id string, name string)
USING com.databricks.spark.csv
OPTIONS(path '/home/ubuntu/data/example/roads.csv', header 'true');
0 rows inserted/updated/deleted

Solution

You can do it the following way:

CREATE TABLE STAGING_ROADS USING com.databricks.spark.csv OPTIONS(path 'roads.csv', header "true");
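
To check that the load worked, a quick query from the same snappy shell should be enough. This is only a sketch: it assumes the table was created in the current schema and that the CSV path is readable from wherever the statement runs. Because no column list is given, spark-csv should take the column names from the header row (roadId and name in the sample file), and every column should come back as a string unless the spark-csv option inferSchema 'true' is added.

```sql
-- Assumption: run in the same snappy session and schema as the CREATE TABLE above.

-- Count the data rows; the header line should not be counted when header is 'true'.
SELECT COUNT(*) FROM STAGING_ROADS;

-- Inspect the rows and the column names taken from the header.
SELECT * FROM STAGING_ROADS;
```

Note that "0 rows inserted/updated/deleted" after CREATE TABLE appears to be the shell's usual reply to a DDL statement, so on its own it does not mean the load failed.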