Search code examples
sqlhiveconcatenationhiveqlgreenplum

greenplum STRING_AGG functions convert to hiveSQL


We have to migrate greenplum sql to hivesql and below query is not supporing string_agg keyword. kindly help us.

select data_date, subscriber_id, msisdn, product, validity,
    STRING_AGG(d0,'xx') d0, STRING_AGG(d1,'') d1, STRING_AGG(d2,'') d2, STRING_AGG(d3,'') d3, STRING_AGG(d4,'') d4,
    STRING_AGG(d5,'') d5, STRING_AGG(d6,'') d6, STRING_AGG(d7,'') d7, STRING_AGG(d8,'') d8, STRING_AGG(d9,'') d9, STRING_AGG(d10,'') d10,
    STRING_AGG(d11,'') d11
from tmp_subscription_base_02
group by 1,2,3,4,5

The string_agg is not supporting in hivesQL.


Solution

  • string_agg(expression, delimiter) aggregation function can be replaced in Hive with

    concat_ws(delimiter,collect_list(cast(expression as string))
    

    Or if you need to concatenate DISTINCT values:

    concat_ws(delimiter,collect_set(cast(expression as string))
    

    Note: if expression is of type string, cast(expression as string) is not necessary, use expression as is.