Search code examples
pythonamazon-redshiftudf

How to make my Amazon Redshift UDF receive a string instead of a list?


I created this function in order to receive a listagg() as a parameter and remove the duplicates.

create or replace function reports.f_listagg_distinct (l varchar(20000))
    returns varchar(20000)
immutable
as $$
aux_list = l.split(',')
nl = []
for i in aux_list:
    if i not in nl:
        nl.append(i)
return nl
$$ language plpythonu;

When I run the code below I receive this Error: TypeError: expected string or Unicode object, list found

select reports.f_listagg_distinct('teste,teste,teste');

What am I doing wrong?


Solution

  • If you're looking to return a distinct list of comma-separated objects, use this function from remove duplicates from comma separated string (Amazon Redshift):

    CREATE FUNCTION f_uniquify (s text)
      RETURNS text
    IMMUTABLE
    AS $$
      -- Split string by comma, remove duplicates, convert back to comma-separated
      return ', '.join(set(s.split(',')))
    $$ LANGUAGE plpythonu;
    

    Testing it with:

    select f_uniquify('test,teste,tester,teste,test');
    

    Returns:

    test, teste, tester