Apache Drill: query column names from CSV data


I have a CSV file on my local file system, which I can query like this:

SELECT * FROM  dfs.`/Users/HOF/Downloads/cars.csv`;

Note: I have

  "skipFirstLine": true,
  "extractHeader": true,

in the storage plugin configuration for csv.

The data in the CSV file looks like this:

Name,Mileage,Cylinders,Displacement,Horsepower,Origin
ford torino,17,8,302,140,USA
ford galaxie 500,15,8,429,198,USA
...

Now I want a query to return the field information, like this:

| COLUMN_NAME | DATATYPE |
|-------------|----------|
| Name        | *        |
| Mileage     | *        |
| Cylinders   | *        |
...

I tried with

DESCRIBE dfs.`/Users/HOF/Downloads/cars.csv`;

but I get an empty list of columns:

|-------------|-----------|-------------|
| COLUMN_NAME | DATA_TYPE | IS_NULLABLE |
|-------------|-----------|-------------|
|-------------|-----------|-------------|

Solution

  • Currently, DESCRIBE does not support tables created in a file system [1]. It does work with views, though, so if you create a view over your data, you may get the desired result. See the DESCRIBE documentation [1] for details.

    [1] https://drill.apache.org/docs/describe/
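
    The view approach could look roughly like the sketch below. It is not taken from the Drill documentation: the view name cars_view is hypothetical, dfs.tmp is assumed to be a writable workspace in your dfs storage plugin, and the column types are guesses based on the two sample rows in the question, so adjust them to your real data. Columns are listed and cast explicitly so that the view carries a name and a type for each of them.

    ```sql
    -- Assumption: dfs.tmp is a writable workspace; views must be created in one.
    USE dfs.tmp;

    -- Hypothetical view name. With "extractHeader": true the CSV columns can be
    -- referenced by their header names; they are read as text, so each one is
    -- cast to the type the view should report. Types here are assumed from the
    -- sample rows only.
    CREATE OR REPLACE VIEW cars_view AS
    SELECT
      CAST(`Name`         AS VARCHAR(100)) AS `Name`,
      CAST(`Mileage`      AS INT)          AS `Mileage`,
      CAST(`Cylinders`    AS INT)          AS `Cylinders`,
      CAST(`Displacement` AS INT)          AS `Displacement`,
      CAST(`Horsepower`   AS INT)          AS `Horsepower`,
      CAST(`Origin`       AS VARCHAR(50))  AS `Origin`
    FROM dfs.`/Users/HOF/Downloads/cars.csv`;

    -- Describe the view instead of the file.
    DESCRIBE cars_view;
    ```

    Note that the casts are only evaluated when the view is queried, so a value that does not fit its assumed type (for example, a decimal mileage or an empty field) would cause an error at query time rather than when the view is created.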