I have created a table in hive
with complex structure
in one column.
example records:
+-----------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+--+
| id | order | |
+-----------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+--+
| 1 | [{"books":"prince","order_timestamp":"2022-01-19 00:45:22","check_out_timestamp":"2022-01-19 00:45:22"},{"books":"venceremos","order_timestamp":"2022-01-19 00:47:13","check_out_timestamp":null},{"books":"rich dad poor dad","order_timestamp":null,"check_out_timestamp":"2022-01-19 00:47:13"}] | |
| 2 | [{"books":"lord of flies","order_timestamp":"2022-01-11 12:47:14","check_out_timestamp":"2022-01-11 13:08:20"},{"books":"test","order_timestamp":"2022-01-11 12:47:14","check_out_timestamp":"2022-01-11 12:47:14"},{"books":"physics","order_timestamp":"2022-01-11 12:47:14","check_out_timestamp":"2022-01-11 12:47:14"}] | |
| 3 | [{"books":"test","order_timestamp":"2022-01-14 18:21:03","check_out_timestamp":"2022-01-14 18:21:03"},{"books":"up and down","order_timestamp":"2022-01-14 18:23:21","check_out_timestamp":"2022-01-14 18:23:21.018"},{"books":"mathematics","order_timestamp":"2022-01-14 18:23:21","check_out_timestamp":"2022-01-14 18:23:21"}] | |
+-----------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+--+
I want to ask, how can I perform some aggregations in this column? For example how many books ordered for 14/1/22 or the top selling book?
Explode array of struct using lateral view inline
and calculate. For example, how many books ordered per date:
select date(e.order_timestamp) order_date
count(*) as book_cnt
from table_name
lateral view inline(order) e as books, order_timestamp, check_out_timestamp
where date(e.order_timestamp) = date('2022-01-14')
group by date(e.order_timestamp)