parquet apache-drill

Exception whilst querying decimal fields in Apache Drill


I am trying to run the following query in Apache Drill against data stored in Parquet files:

select 
    pan, count(*) as number_of_transactions,
    terminal_id,
    sum((cast(SETTLE_AMOUNT_IMPACT as double) * -1) / 100) as settle_amount_impact
from 
    dfs.`/iswdata/storage/products/superswitch/parquet/transactions`
where 
    pan like '506126%' 
    and terminal_id like '1%' 
    and sink_node_name like ('SWTDB%') 
    and source_node_name not like ('SWTDBLsrc')
    and tran_completed = 1
    and tran_reversed = 0
    and tran_postilion_originated = 1
    and tran_type = '01'
    --and pan like '506126%0011'
group by 
    pan, terminal_id

The schema for the data I am querying is as follows:

post_tran_id : LONG
post_tran_cust_id : LONG
settle_entity_id : INTEGER
batch_nr : INTEGER
prev_post_tran_id : LONG
next_post_tran_id : LONG
sink_node_name : STRING
tran_postilion_originated : DECIMAL
tran_completed : DECIMAL
tran_amount_req : DECIMAL
tran_amount_rsp : DECIMAL
settle_amount_impact : DECIMAL
tran_cash_req : DECIMAL
tran_cash_rsp : DECIMAL
tran_currency_code : STRING 
tran_tran_fee_req : DECIMAL 
tran_tran_fee_rsp : DECIMAL 
tran_tran_fee_currency_code : STRING 
tran_proc_fee_req : DECIMAL 
tran_proc_fee_rsp : DECIMAL 
tran_proc_fee_currency_code : STRING 
settle_amount_req : DECIMAL 
settle_amount_rsp : DECIMAL 
settle_cash_req : DECIMAL 
settle_cash_rsp : DECIMAL 
settle_tran_fee_req : DECIMAL 
settle_tran_fee_rsp : DECIMAL 
settle_proc_fee_req : DECIMAL 
settle_proc_fee_rsp : DECIMAL 
settle_currency_code : STRING

However, when I run the query against the dataset, I get the following exception:

SYSTEM ERROR: ClassCastException: org.apache.drill.exec.vector.NullableDecimal28SparseVector cannot be cast to org.apache.drill.exec.vector.VariableWidthVector

Moreover, the same error occurs whenever I include a decimal field in the select clause. Is there something I am missing or doing wrong? Any pointers would be deeply appreciated.

Kind Regards


Solution

  • Decimal values in your Parquet table are stored using the BINARY primitive type, and Drill does not currently support decimals stored as binary.

    This is tracked in DRILL-6094, and the fix will be available in the 1.14 release.
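To illustrate what "decimal stored as binary" means, here is a minimal Python sketch of how such a value is laid out, based on my understanding of the Parquet format: the bytes hold the unscaled integer in big-endian two's complement, and the column's scale says where the decimal point goes. The function name `decode_parquet_decimal` and the sample bytes are made up for illustration. This is not Drill code and not a workaround for the exception.

```python
from decimal import Decimal


def decode_parquet_decimal(raw: bytes, scale: int) -> Decimal:
    """Decode a DECIMAL value stored as raw bytes (hypothetical helper).

    Assumes the bytes are the unscaled integer in big-endian
    two's complement, and that `scale` comes from the column metadata.
    """
    unscaled = int.from_bytes(raw, byteorder="big", signed=True)
    sign = 1 if unscaled < 0 else 0
    digits = tuple(int(d) for d in str(abs(unscaled)))
    # Build the Decimal from (sign, digits, exponent) so that no
    # context rounding is applied to large values.
    return Decimal((sign, digits, -scale))


# Illustrative bytes only: 0xCFC7 is -12345 as a 2-byte signed integer.
print(decode_parquet_decimal(b"\xcf\xc7", 2))  # prints -123.45
```

A reader that has no code path for this layout cannot interpret the column, which would be consistent with the error appearing for any decimal field in the select clause.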