Search code examples
sqlsnowflake-cloud-data-platformregexp-replacesnowflake-schema

regex to find "null" values from a column


I have a column JSON_DATA. The values within this column have an object type. It contains a json. This query works fine:

SELECT
    $1 AS "JSON_DATA"
 FROM TEST_TABLE 

This is how the data might look like:

{
  "Type": "xxi",
  "Quantity": "null",
  "ContactPerson": "null",
  "Email": null
}

Some of the values within this field are "null". I want to search all occurrences of quoted "nulls" and replace/regex_replace them with proper NULL values.

Desired output result:

{
  "Type": "xxi",
  "Quantity": NULL,
  "ContactPerson": NULL,
  "Email": null
}

I tried this:

SELECT
    REGEXP_REPLACE( $1 , "null",  NULL) AS "JSON_DATA"
 FROM TEST_TABLE 

but I get an error that:

SQL Error [904] [42000]: SQL compilation error: error line 2 at position 22
invalid identifier '"null"'

I believe its because "null", from the second parameter, isn't the correct regex for the task. How can I fix this?

Edit:

I also tried this:

SELECT
    REGEXP_REPLACE( $1 , '"null"',  NULL) AS "JSON_DATA"
 FROM TEST_TABLE 

but instead of finding and replacing all "null"s within the json objects of each row, the end result is just NULL for all rows.


Solution

  • the double quotes need to be in a normal string

    SELECT
        REGEXP_REPLACE( $1 , '"null"',  NULL) AS "JSON_DATA"
     FROM TEST_TABLE 
    

    As you have it, the DB is looking for a columns named null, that what snowflake reads double quotes as.

    You might also want to look at STRIP_NULL_VALUE and/or IF_NULL_VALUE

    Showing how to butcher JSON via REGEXP_REPLACE

    not that you should do this, but you haven't said what you really want to do:

    SELECT 
        column1,
        parse_json(column1) as json,
        REGEXP_REPLACE( json::text , '"null"',  '"DOGS"') AS fiddled_string,
        parse_json(fiddled_string) as back_to_json_data_again
        
    FROM VALUES ('{
      "Type": "xxi",
      "Quantity": "null",
      "ContactPerson": "null",
      "Email": null
    }')
    

    gives:

    COLUMN1 JSON FIDDLED_STRING BACK_TO_JSON_DATA_AGAIN
    { "Type": "xxi", "Quantity": "null", "ContactPerson": "null", "Email": null } { "ContactPerson": "null", "Email": null, "Quantity": "null", "Type": "xxi" } {"ContactPerson":"DOGS","Email":null,"Quantity":"DOGS","Type":"xxi"} { "ContactPerson": "DOGS", "Email": null, "Quantity": "DOGS", "Type": "xxi" }