Search code examples
jsonpostgresqlplpgsql

How to prevent 'invalid input syntax for type json' in Postgres, when records contain a mix of json or strings


I have a text column that contains JSON and also plan text. I want to convert it to JSON, and then select a particular property. For example:

user_data
_________
{"user": {"name": "jim"}}
{"user": {"name": "sally"}}
some random data string

I've tried:

select user_data::json#>'{user,name}' from users

I get:

ERROR:  invalid input syntax for type json
DETAIL:  Token "some" is invalid.
CONTEXT:  JSON user_data, line 1: some...

Is it possible to prevent this?


Solution

  • If you want to skip the rows with invalid JSON, you must first test if the text is valid JSON. You can do this by creating a function which will attempt to parse the value, and catch the exception for invalid JSON values.

    CREATE OR REPLACE FUNCTION is_json(input_text varchar) RETURNS boolean AS $$
      DECLARE
        maybe_json json;
      BEGIN
        BEGIN
          maybe_json := input_text;
        EXCEPTION WHEN others THEN
          RETURN FALSE;
        END;
    
        RETURN TRUE;
      END;
    $$ LANGUAGE plpgsql IMMUTABLE;
    

    When you have that, you could use the is_json function in a CASE or WHERE clause to narrow down the valid values.

    -- this can eliminate invalid values
    SELECT user_data::json #> '{user,name}'
    FROM users WHERE is_json(user_data);
    
    -- or this if you want to fill will NULLs
    SELECT
      CASE
        WHEN is_json(user_data)
          THEN user_data::json #> '{user,name}'
        ELSE
          NULL
      END
    FROM users;