Search code examples
sqlarrayshivehiveqlnumeric

In Hive, how to convert an array of string to an array of numeric numbers


I have a hive table that looks like the following:

id | value
 1 | ['0', '0', '1', '0', '1', '1', '0', '0']
 2 | ['2', '0', '3', '0', '3', '1', '2', '1']

I want the result to be the following:

id | value
 1 | [0,0,1,0,1,1,0,0]
 2 | [2,0,3,0,3,1,2,1]

I need to convert them into an array of float so that I can use them in ST_Constains(ST_MultiPolygon(), st_point()) to determine if a point is in an area.

I am new to Hive, not sure if that is possible, any help would be very appreciated.


Solution

  • You can explode array, cast value, collect array again. Demo:

    with your_table as(
    select stack(2,
     1 , array('0', '0', '1', '0', '1', '1', '0', '0'),
     2 , array('2', '0', '3', '0', '3', '1', '2', '1')
     ) as (id,value)
     ) --use your_table instead of this
    
    
     select s.id, 
            s.value                            as original_array, 
            collect_list(cast(s.str as float)) as array_float 
     from
    (select t.*, s.* 
     from your_table t
                   lateral view outer posexplode(t.value)s as pos,str       
       distribute by t.id, t.value 
             sort by s.pos --preserve order in the array
     )s  
    group by s.id, s.value;  
    

    Result:

    OK
    1       ["0","0","1","0","1","1","0","0"]       [0.0,0.0,1.0,0.0,1.0,1.0,0.0,0.0]
    2       ["2","0","3","0","3","1","2","1"]       [2.0,0.0,3.0,0.0,3.0,1.0,2.0,1.0]
    

    See also this answer about sorting array in the query https://stackoverflow.com/a/57392965/2700344