Search code examples
sqlapache-drill

How to get more error details in Apache Drill


 Is there any way how to get more details from SQL errors?

Speaking in terms of data-related errors, drill does not give any clue where and how to find the issue. SQL syntax and logic errors are rendered somewhat comprehensible, but imagine these cases:

Classic example

You have ~10GB CSV full of numbers (sales.csv):

ArticleId,CategoryId,Price,SupplierPrice,VAT
1234,23,15.19,12.45,0
1235,23,16.19,13.45,0.15
...
[83541670] lines
...
475,34.0,55.0,50,0.15  # This random error cause (CategoryId should be INT for this example)
...
[34767806] lines
...
[EOF]

Then consider a similar query:

SELECT 
 CAST (ArticleId as INT) as ArticleId,
 CAST (CategoryId as INT) as CategoryId,
 CAST (Price as DOUBLE) as Price,
 CAST (SupplierPrice as double) as SupplierPrice,
 CAST (VAT as DOUBLE) as VAT
from (...)/sales.csv

And you got this error:

SYSTEM ERROR: NumberFormatException: 
Fragment 0:0
Please, refer to logs for more information.

Well, referring to logs:

[Error Id: 682cc450-61fb-4307-809a-fcb794fec692 on drill-staging-745f9968d4-m5pv7:31010]
    at org.apache.drill.common.exceptions.UserException$Builder.build(UserException.java:630) ~[drill-common-1.16.0.jar:1.16.0]
    at org.apache.drill.exec.work.fragment.FragmentExecutor.sendFinalState(FragmentExecutor.java:363) [drill-java-exec-1.16.0.jar:1.16.0]
    at org.apache.drill.exec.work.fragment.FragmentExecutor.cleanup(FragmentExecutor.java:219) [drill-java-exec-1.16.0.jar:1.16.0]
    at org.apache.drill.exec.work.fragment.FragmentExecutor.run(FragmentExecutor.java:329) [drill-java-exec-1.16.0.jar:1.16.0]
    at org.apache.drill.common.SelfCleaningRunnable.run(SelfCleaningRunnable.java:38) [drill-common-1.16.0.jar:1.16.0]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [na:1.8.0_111-internal]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [na:1.8.0_111-internal]
    at java.lang.Thread.run(Thread.java:745) [na:1.8.0_111-internal]
Caused by: java.lang.NumberFormatException: 
    at org.apache.drill.exec.expr.fn.impl.StringFunctionHelpers.nfeI(StringFunctionHelpers.java:96) ~[drill-java-exec-1.16.0.jar:1.16.0]
[...continues with hundred similar lines ...]

The problem

  • You don't know what happened, only that it has to do something with some number.
  • You don't know where that happened
  • When referring to logs, there is no useful information
  • It happens often.
  • Investigating these errors takes huge amount of time, because every time, you have to go through hundred millions lines and search the error by some work-around approximation queries.
  • Using drill for data-cleaning makes it impossible just for these reasons. We would have to use another big-data solution and at this point we wouldn't have to use Drill at all.

 Questions

  • Is there any option to increase verbosity?
  • Aren't there any other logs / sources where to get error detail?
  • Is there any useful workaround how to handle these situations while not ruining the performance? (There are many combinations, ie '' instead of null and vice versa, wrong datatype for number, too many decimal places for a decimal, ...)

When it would be so beautiful when Drill would give us errors like ...:

ERROR casting column "Price" to Int - Invalid integer value "34.0"

Solution

  • Cast operation is applied after data was read thus at this point Drill does not have information from where in file incorrect data came from. Since Drill 1.16 schema provisioning support was introduced. Table schema is applied during data read, so Drill can produce better error messages (DATA_READ_ERROR). For more information see https://drill.apache.org/docs/create-or-replace-schema/.