Search code examples
etlverticaflextable

Vertica: Parse string as JSON when inserting data from table into flextable


I use Vertica 9.2.1 in EON-mode.I have a fact table with a column that holds JSON strings. I want to load this data together with some identifiers from the fact table into a flextable. So that we can run analysis on that data. What I want to avoid is, loading all the necessary data onto an ETL machine to transform the data and then load it into the flextable, since all the data is already available in Vertica. How can I tell Vertica to parse a VARCHAR column as JSON?

CREATE TABLE public.tmp_facts ("id" INTEGER, "user_id" VARCHAR(64), "event_type" VARCHAR(50), /* other columns omitted */ "additional" VARCHAR(65000));
INSERT INTO public.tmp_facts ("id", "user_id", "event_type", "additional")
SELECT 1, 'user1', 'event1', '{"os":"Android", "time":"'||NOW()||'"}';
CREATE FLEX TABLE public.fact_additional
(
    "id" INTEGER NOT NULL,
    "user_id" VARCHAR(64) NOT NULL,
    "event_type" VARCHAR(50)
);
INSERT INTO public.fact_additional ("id", "user_id", "event_type")
SELECT "id", "user_id", "event_type", "additional" FROM tmp_facts;

SELECT "additional", "additional.os", "additional[os]" FROM fact_additional;

I expected that the last query outputs for at least one column Android


Solution

  • You need to pass the additional column through MapJSONExtractor() function when inserting from public.tmp_facts into public.fact_additional

    INSERT INTO public.fact_additional ("id", "user_id", "event_type")
    SELECT "id",
           "user_id",
           "event_type", 
           MapJSONExtractor("additional") as additional 
    FROM tmp_facts;
    
    SELECT "additional"['os'] as os FROM fact_additional;
    
       os    
    ---------
     Android
    (1 row)
    

    Notice the usage of single / double quotes at the appropriate places.