Search code examples
jsonamazon-redshiftregexp-replace

How can i replace JSON values in Redshift?


I have a JSON array in my redshift table which has True/False/None values. Now since JSON only accepts lower case values and don't accept None, I want to convert those values into lower case true/false/null values. Bearing in my mind my JSON array has some keys whose values also contain the name True.

Example :

[{"id": 20198, "name": "True Ventures", "path": "true_ventures", "type": "Fund", "lead": False}, {"id": 324746, "name": "XXX", "path": "XXX", "type": "Investor", "url": "XXX", "image": "XXX", "lead": False}]

[{"id": 20198, "name": "True Ventures", "path": "true_ventures", "type": "Fund", "lead": True}, {"id": 324746, "name": "XXX", "path": "XXX", "type": "Investor", "url": "XXX", "image": "XXX", "lead": True}]

[{"id": 20198, "name": "True Ventures", "path": "true_ventures", "type": "Fund", "lead": None}, {"id": 324746, "name": "XXX", "path": "XXX", "type": "Investor", "url": "XXX", "image": "XXX", "lead": None}]

Now, I want to replace the False/True/None values occurrences across the entire JSON array wherever there is a True, False, None values. (in this case for the lead key and not the name and path of the company). I am currently using

case
       when column_name ilike ('%True%') then regexp_replace(replace(column_name, '\'', '"'), 'True', 'true')
       when column_name ilike ('%False%')then regexp_replace(replace(column_name, '\'', '"'), 'False', 'false')
       when column_name ilike ('%None%') then replace(replace(column_name,'\'','"'),'None','"None"')
       end as column_name

Please let me know what would be the right way to use the regexp_replace ?

Thanks!


Solution

  • Since this is json-like text a few assumptions can be made - that these keywords will always follow a colon (not in a list) and that colons won't appear in a field name. Also it is likely, but you will need to confirm, that a colon won't appear in a data value followed by one of these keywords AND that these keywords will follow a colon by only one space character. If this is all true then the change should be straight forward to make.

    Just replace ': True' with ': true'. The same holds for the other keywords. Your case statement is in error as it will only allow for one keyword to be replaced per row of text which isn't true for you example data. So as an untested example of what this will look like:

    Select replace(replace(replace(column_name, ': True', ': true'),
                ': False', ': false'),
                ': None', ': null') as column_name 
    from table_name;
    

    If a more general purpose approach is needed then sample data of the concerning corner-cases will be needed.