I have a table with 6 fields that looks like this:
Field1 Field2 Field3 Field4 Field5 Field6
ABC 45 XYZ JKL BNM
65 QWE JKL
WER YUI IOP GHJ
I want to pull the data from that table above to a new table with 5 fields where we ignore the empty values. My final table should look like this:
Result1 Result2 Result3 Result4 Result5
ABC 45 XYZ JKL BNM
65 QWE JKL
WER YUI IOP GHJ
I have started to write a massive conditional query with CASE WHEN but it's getting out of hand and prone to errors. Is it possible to get that table using a regex_extract query in Hive?
Assuming "empty values" are Nulls
select fields[0] as Field1
,fields[1] as Field2
,fields[2] as Field3
,fields[3] as Field4
,fields[4] as Field5
from (select split(concat_ws(string(unhex(1)),*),'\\x01') as fields
from mytable
) t
+--------+--------+--------+--------+--------+
| field1 | field2 | field3 | field4 | field5 |
+--------+--------+--------+--------+--------+
| ABC | 45 | XYZ | JKL | BNM |
| 65 | QWE | JKL | (null) | (null) |
| WER | YUI | IOP | GHJ | (null) |
+--------+--------+--------+--------+--------+
Simplified version, assuming comma (,
) does not appear in your fields:
select ...
from (select split(concat_ws(',',*),',') as fields
from mytable
) t
Assuming "empty values" are empty strings
select fields[0] as Field1
,fields[1] as Field2
,fields[2] as Field3
,fields[3] as Field4
,fields[4] as Field5
from (select split(regexp_replace(concat_ws(string(unhex(1)),*),'^\\x01+|\\x01+$|(\\x01)+','$1'),'\\x01') as fields
from mytable
) t
+--------+--------+--------+--------+--------+
| field1 | field2 | field3 | field4 | field5 |
+--------+--------+--------+--------+--------+
| ABC | 45 | XYZ | JKL | BNM |
| 65 | QWE | JKL | (null) | (null) |
| WER | YUI | IOP | GHJ | (null) |
+--------+--------+--------+--------+--------+
Simplified version, assuming comma (,
) does not appear in your fields:
select ...
from (select split(regexp_replace(concat_ws(',',*),'^,+|,+$|(,)+','$1'),',') as fields
from mytable
) t