
Apache Drill cannot read local files on Windows


I'm new to Apache Drill. I want to use Drill to manage local data, so I configured the data storage path in the JSON configuration. However, when I query the data with SQL, Drill can't read the local file.

I have installed Apache Drill and configured the file path in the JSON configuration (named 'data_access') as follows:

{
  "type": "file",
  "connection": "file:///",
  "workspaces": {
    "channel_0": {
      "location": "C:/Users/Admin/Desktop/DataFiles/Channel_0",
      "writable": false,
      "defaultInputFormat": null,
      "allowAccessOutsideWorkspace": false
    }
  },
  "formats": {
    "json": {
      "type": "json",
      "extensions": [
        "json"
      ]
    },
    "csv": {
      "type": "text",
      "extensions": [
        "csv"
      ],
      "lineDelimiter": "\n",
      "fieldDelimiter": ",",
      "quote": "\"",
      "escape": "\"",
      "comment": "#"
    },
    "parquet": {
      "type": "parquet"
    }
  },
  "authMode": "SHARED_USER",
  "enabled": true
}

I query the data with the following SQL statement:

select * from data_access.channel_0.`20240806_205605.txt`;

The error is:

Error: VALIDATION ERROR: From line 1, column 15 to line 1, column 57: Object '20240806_205605.txt' not found within 'data_access.channel_0'

I can confirm that the file (20240806_205605.txt) exists and that I have permission to read it, but I don't know why Drill can't find it.


Solution

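    1. Run SHOW FILES IN data_access.channel_0 to see which files Drill finds in that workspace.
    2. Make sure that one of your format configurations, probably the "csv" one, lists the file name extension "txt" so that it matches 20240806_205605.txt. None of the formats in your configuration currently lists "txt". Add it alongside the existing entry so that .csv files still match:
    "extensions": [
        "csv",
        "txt"
    ]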
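
The two steps above can be checked from the Drill shell roughly as follows. This is a sketch, not a verified session: it assumes the plugin is still named data_access and that the file holds comma-delimited text, which the question does not confirm. The last statement is an alternative that passes the format through Drill's table function syntax instead of editing the plugin configuration; whether it suits your file depends on the same delimiter assumption.

```sql
-- Step 1: list what Drill sees in the workspace.
-- The file 20240806_205605.txt should appear in the result.
SHOW FILES IN data_access.channel_0;

-- Step 2: after adding "txt" to the "csv" format's extensions
-- and saving the plugin configuration, retry the original query.
SELECT * FROM data_access.channel_0.`20240806_205605.txt`;

-- Alternative (assumption: comma-delimited text): name the format
-- per query with a table function, leaving the plugin config unchanged.
SELECT *
FROM table(data_access.channel_0.`20240806_205605.txt`(type => 'text', fieldDelimiter => ','));
```

If SHOW FILES does not list the file at all, the problem is the workspace location rather than the format configuration.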