Assume I have this CSV file:
a,b
NULL,1
1,2
Then I try doing some queries:
> SELECT * FROM dfs.`/path/to/foo.csv`;
+-------+----+
| a | b |
+-------+----+
| NULL | 1 |
| 1 | 2 |
+-------+----+
2 rows selected
> SELECT * FROM dfs.`/path/to/foo.csv` WHERE a is NULL;
+----+----+
| a | b |
+----+----+
+----+----+
No rows selected
> SELECT * FROM dfs.`/path/to/foo.csv` WHERE a is NOT NULL;
+-------+----+
| a | b |
+-------+----+
| NULL | 1 |
| 1 | 2 |
+-------+----+
2 rows selected
> SELECT * FROM dfs.`/path/to/foo.csv` WHERE a > 0;
Error: SYSTEM ERROR: NumberFormatException: NULL
Fragment 0:0
It seems Apache Drill is considering NULL
as a string.
Even if I replace it with an empty string (sed -i 's/NULL//g' foo.csv
), Apache Drill still thinks those values are not null.
Is there any way to make it consider NULL
(or empty value) as… well… null?
I'm using the latest release version: 1.9.0.
Absolutely you can!