Search code examples
kdb

Extracting multiple columns from a column containing JSON text in KDB


I have a table which contains a column having JSON text. I want to parse that column and extract the different attributes to multiple columns.

c:([] date:2?.z.d ;  client:( "{  \"firstName\": \"John\",  \"lastName\": \"Smith\", \"age\": 27 }" ; "{  \"firstName\": \"Scott\",  \"lastName\": \"Tiger\", \"age\":29 }" ) )  

Solution

  • Here is an alternative method to create a single json array from the entire json column.

    Running .j.k on a single string would be more efficient than running .j.k on many small arrays/dictionaries/strings.

    // test table
    q)c:([] date:2?.z.d ;  client:( "{  \"firstName\": \"John\",  \"lastName\": \"Smith\", \"age\": 27 }" ; "{  \"firstName\": \"Scott\",  \"lastName\": \"Tiger\", \"age\":29 }" ) );
    // combine each string with "," and encompass in "[]"
    // join each parsed dictionary with each row from c, keep client column for keep sake
    q)c,'exec .j.k {"[",x,"]"}"," sv client from c
    

    Often is the case with json capturing, that the data and messages are schemaless/unstructured. If that's the case and each json string doesn't necessarily have the same tags as others, then you can create a static map of your required output and join that map to each parsed json message. This will then have the affect of always resolving to a table. You can also type your map to ensure the resulting table is correctly typed.

    // insert row with missing age tag and new location tag
    q)`c insert (.z.d;"{\"firstName\": \"Sean\", \"lastName\": \"O'Hagan\", \"location\" : \"Dub\"}");
    // name and locations are strings, age is float
    q)map:{x,'first each x$\:()}[`firstName`lastName`age`location!10 10 9 11h];
    // manually edit symbol nulls to be varchars, so when casting to symbol the RHS param is always a varchar
    q).[`map;where map[;1]~\:`;:;(-11h;" ")];
    // join map to each parsed dictionary, and delete client column as we have extracted useful data
    q)delete client from c,'exec flip map[;0]$flip (map[;1],/:.j.k {"[",x,"]"}"," sv client) from c
    date       firstName lastName  age location
    -------------------------------------------
    2004.02.01 "John"    "Smith"   27
    2005.06.06 "Scott"   "Tiger"   29
    2018.03.13 "Sean"    "O'Hagan"     Dub
    q)meta delete client from c,'exec flip map[;0]$flip (map[;1],/:.j.k {"[",x,"]"}"," sv client) from c
    c        | t f a
    ---------| -----
    date     | d
    firstName| C
    lastName | C
    age      | f
    location | s
    

    HTH, Sean