I have a hive table that looks like the following:
id | value
1 | ['0', '0', '1', '0', '1', '1', '0', '0']
2 | ['2', '0', '3', '0', '3', '1', '2', '1']
I want the result to be the following:
id | value
1 | [0,0,1,0,1,1,0,0]
2 | [2,0,3,0,3,1,2,1]
I need to convert them into an array of float so that I can use them in ST_Constains(ST_MultiPolygon(), st_point())
to determine if a point is in an area.
I am new to Hive, not sure if that is possible, any help would be very appreciated.
You can explode array, cast value, collect array again. Demo:
with your_table as(
select stack(2,
1 , array('0', '0', '1', '0', '1', '1', '0', '0'),
2 , array('2', '0', '3', '0', '3', '1', '2', '1')
) as (id,value)
) --use your_table instead of this
select s.id,
s.value as original_array,
collect_list(cast(s.str as float)) as array_float
from
(select t.*, s.*
from your_table t
lateral view outer posexplode(t.value)s as pos,str
distribute by t.id, t.value
sort by s.pos --preserve order in the array
)s
group by s.id, s.value;
Result:
OK
1 ["0","0","1","0","1","1","0","0"] [0.0,0.0,1.0,0.0,1.0,1.0,0.0,0.0]
2 ["2","0","3","0","3","1","2","1"] [2.0,0.0,3.0,0.0,3.0,1.0,2.0,1.0]
See also this answer about sorting array in the query https://stackoverflow.com/a/57392965/2700344