Search code examples
jsonoracle19c

How to extract an entire JSON element from Oracle 19c CLOB


I have a table with JSON data stored in a CLOB. We get this data from an external source and recently they changed some formatting which causes issues with our post processing.
The data contains an object containing user roles and, when correctly formatted, arrays of locations associated to an individual role.

The problem arises when one user has a Role (Test Role 1) with a Location with just curly braces {} and another user also has the same role (Test Role 1) with a location with valid data. When we query the second user's roles we get null location date for that user.

In the sample data there is employeeID 1 (Whitbuckle, Dalongrirlum) who has roles of Test Role 1 and Test Role 2, each with a Location {} and employeeID 2 (Longblade, Skolout) with a role of Test Role 1 with valid locations. The other two users have either an empty EntitlementJSON attribute or Test Role 3 with valid location data.

When we query the data, employeeID 2 record has null roles even if we explicitly select only their employeeID.

Requested Solution: I am writing a validation procedure to makes sure that rows with the bad formatting gets identified. To do this, I would like to select into a variable the contents of the EntitlementJSON attribute for a single user. I would then check for the existance of "location":{}. If it exists this is a bad record. For example, what I would like to see for employeeID 1 is:

      "Test Role 1": {
        "dodaac": {},
        "fundCode": {},
        "glRepair": {},
        "location": {},
        "cognos": {},
        "jv": {}
      },
      "Test Role 2": {
        "dodaac": {},
        "fundCode": {},
        "glRepair": {},
        "location": {},
        "cognos": {},
        "jv": {}
      }

There is an example at this db<>fiddle

Code samples

CREATE TABLE TEST_JSON
    (    PROCESS_ID              NUMBER GENERATED BY DEFAULT ON NULL AS IDENTITY,
       JSON_DATA                             CLOB       CONSTRAINT check_json CHECK (JSON_DATA IS JSON)
    )
LOB (JSON_DATA) STORE AS SECUREFILE (
  ENABLE      STORAGE IN ROW
  CHUNK       8192
  RETENTION
  NOCACHE
  LOGGING);

-- TABLE ALTERS
ALTER TABLE TEST_JSON 
  ADD CONSTRAINT TEST_JSON_PK 
  PRIMARY KEY ( PROCESS_ID ) USING INDEX
  ENABLE;

    set serveroutput on

declare
      c                      clob;
  BEGIN

    c:= to_clob('[
      {
        "displayName": "Whitbuckle, Dalongrirlum",
        "employeeID": "1",
        "EntitlementJSON": {
          "Test Role 1": {
            "dodaac": {},
            "fundCode": {},
            "glRepair": {},
            "location": {},
            "cognos": {},
            "jv": {}
          },
          "Test Role 2": {
            "dodaac": {},
            "fundCode": {},
            "glRepair": {},
            "location": {},
            "cognos": {},
            "jv": {}
          }
        },
        "manager": "Urgaehilde Rubyforged",
        "company": "Bloodguard Industrie"
      },
      {
        "displayName": "Koboldbelly, Sitgrolin",
        "employeeID": "4",
        "EntitlementJSON": {},
        "manager": "Kogrubera Orcborn",
        "company": "Bloodguard Industrie"
     },
     {
        "displayName": "Longblade, Skolout",
        "employeeID": "2",
        "EntitlementJSON": {
          "Test Role 1": {
            "location": [
              "Rockwall Villa - RV",
              "Thunderbluff - TB"
            ]
          }
        },
        "manager": "Therrilyn Mithrilpike",
        "company": "Bloodguard Industrie"
      },
      {
        "displayName": "Warmcoat, Alfomdum",
        "employeeID": "3",
        "EntitlementJSON": {
          "Test Role 3": {
            "location": [
              "ALL"
            ]
          }
        },
        "manager": "Therrilyn Mithrilpike",
        "company": "Bloodguard Industrie"
      }
    ]');

    INSERT INTO TEST_JSON (JSON_DATA)
    VALUES (c);
    commit;

  END;

Here is the query we run:

select process_id,
     display_name,
     employeeID,
     manager,
     listagg(TR1) within group (order by process_id, display_name, employeeID, manager) Role_TR1,
     listagg(TR2) within group (order by process_id, display_name, employeeID, manager) Role_TR2,
     listagg(TR3) within group (order by process_id, display_name, employeeID, manager) Role_TR3,
     listagg(TR4) within group (order by process_id, display_name, employeeID, manager) Role_TR4
from (select j.process_id,
             jt.display_Name,
             jt.employeeID,                        
             jt.manager,                            
             TR1,
             TR2,
             TR3,
             TR4
        from test_json j
             cross apply JSON_TABLE(j.JSON_DATA, '$[*]'
               COLUMNS (display_Name                  VARCHAR2(200 CHAR)  PATH '$.displayName',
                        employeeID                    VARCHAR2(20 CHAR) PATH '$.employeeID',
                        manager                       VARCHAR2(200 CHAR) PATH '$.manager',       
                        nested path '$.EntitlementJSON."Test Role 1"' columns
                          (TR1 VARCHAR2(4000 CHAR) FORMAT JSON WITH WRAPPER PATH '$.location[*]'),      
                        nested path '$.EntitlementJSON."Test Role 2"' columns
                          (TR2 VARCHAR2(4000 CHAR) FORMAT JSON WITH WRAPPER PATH '$.location[*]'),      
                        nested path '$.EntitlementJSON."Test Role 3"' columns
                          (TR3 VARCHAR2(4000 CHAR) FORMAT JSON WITH WRAPPER PATH '$.location[*]'),      
                        nested path '$.EntitlementJSON."Test Role 4"' columns
                          (TR4 VARCHAR2(4000 CHAR) FORMAT JSON WITH WRAPPER PATH '$.location[*]')      
                      )) jt
        where process_id = 1)
  --and jt.employeeID = '2')
group by process_id, employeeID, display_name, manager;

Even when we un-comment the "and jt.employeeID = '2'" line we still get null locations for employeeID 2


Solution

  • You don't need to aggregate or use NESTED PATH:

    SELECT process_id,
           display_name,
           employeeID,
           manager,
           tr1,
           tr2,
           tr3,
           tr4
    from   test_json j
           CROSS APPLY JSON_TABLE(
             j.JSON_DATA, '$[*]'
             COLUMNS (
               display_Name VARCHAR2(200 CHAR) PATH '$.displayName',
               employeeID   VARCHAR2(20 CHAR)  PATH '$.employeeID',
               manager      VARCHAR2(200 CHAR) PATH '$.manager',
               tr1         JSON               PATH '$.EntitlementJSON."Test Role 1".location',
               tr2         JSON               PATH '$.EntitlementJSON."Test Role 2".location',
               tr3         JSON               PATH '$.EntitlementJSON."Test Role 3".location',
               tr4         JSON               PATH '$.EntitlementJSON."Test Role 4".location'
             )
           ) e
    WHERE  j.process_id = 1
    AND    e.employeeID = '2';
    

    Which, for the sample data, outputs:

    PROCESS_ID DISPLAY_NAME EMPLOYEEID MANAGER TR1 TR2 TR3 TR4
    1 Longblade, Skolout 2 Therrilyn Mithrilpike ["Rockwall Villa - RV","Thunderbluff - TB"] null null null

    fiddle