Search code examples
sqlclickhouse

How should I construct source `String` so as to transform and put it into `Array(Tuple(Int, String, String))` of target table?


I am reading impala parquet from HDFS into Clickhouse. Target table in clickhouse has 2 complex types:

target_type1 Array(Tuple(  LowCardinality(String),
                      Int 8,
                      Int 32, 
                      Int 32,
                      Int 32
         ))
target_type2 Array(Tuple(  LowCardinality(String),
                      LowCardinality(String), 
                      LowCardinality(String)
         ))

Source table's type is String. How should I construct souce table's String and transform it into the target type?

For example,

Target table

create table test (
    a String,
    b Array(String),
    c Array(Tuple(Int, String, String))
) engine = MergeTree order by tuple()

Source

with t as (
    select '123#def#aaa|456#xxx#aaa|789#bbbb#aaaa|3333#www#aaaa' as x, splitByChar('|', x) as y, splitByChar('#', y[1]) as z
) select * from t;

How should I construct the source string x and transform it, so as to load into c Array(Tuple(Int, String, String))?


Solution

  • Something like this?

    INSERT INTO test (c) SELECT groupArray(new) AS val
    FROM
    (
        SELECT
            '123#def#aaa|456#xxx#aaa|789#bbbb#aaaa|3333#www#aaaa' AS x,
            splitByChar('|', x) AS y,
            splitByChar('#', arrayJoin(y)) AS s,
            (toInt32(s[1]), s[2], s[3]) AS new
    )
    

    Output is:

    SELECT *
    FROM test
    
    Query id: e087e0b8-af15-45e9-a4bb-ce22cc2d6f6a
    
    ┌─a─┬─b──┬─c─────────────────────────────────────────────────────────────────────────────┐
    │   │ [] │ [(123,'def','aaa'),(456,'xxx','aaa'),(789,'bbbb','aaaa'),(3333,'www','aaaa')] │
    └───┴────┴───────────────────────────────────────────────────────────────────────────────┘
    
    1 row in set. Elapsed: 0.002 sec.
    

    I think it could be done without expanding the array with some lambda functions too.