
Generate a JSON schema and validate JSON against the schema


How do I generate a JSON schema in Python for the JSON below and validate every JSON document against that schema?

Requirements:

  • There is a type, which can be csv or json.
  • There is a list of properties.
  • Each property has a key, which is a column name, and a value, which is a dict that defines the column's attributes. For example, the first column, col1, has the type string and corresponds to index zero.
    {
        "type": "csv/json",
        "properties": {
            "col1": {
                "type": "string",
                "index": 0
            },
            "col2": {
                "type": "number",
                "index": 1
            },....
        }
    }

How do I generate a JSON schema for this JSON?

Sample valid JSON

{
    "type": "csv",
    "properties": {
        "header1": {
            "type": "string",
            "index":0
        },
        "header2": {
            "type": "number",
            "index":1
        }   
    }
}

Sample invalid JSON (because the type of header1 is bool and its index attribute is missing)

{
    "type": "CSV",
    "properties": {
        "header1": {
            "type": "bool"
        },
        "header2": {
            "type": "number",
            "index":1
        }   
    }
}


Solution

  • You can use the jsonschema library to validate JSON against a schema in Python. The library does not generate the schema for you; you write the schema by hand as a Python dict.

    Install jsonschema first with pip install jsonschema

    You can then define a JSON schema for your structure and validate each document against it.

    For example:

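    import jsonschema
    from jsonschema import validate
    
    # Define the JSON schema
    schema = {
        "type": "object",
        "properties": {
            # Top-level "type" must be exactly "csv" or "json" (case-sensitive)
            "type": {"enum": ["csv", "json"]},
            "properties": {
                "type": "object",
                # Every column name maps to an object with "type" and "index"
                "patternProperties": {
                    "^.*$": {
                        "type": "object",
                        "properties": {
                            # Assumed set of allowed column types, taken from
                            # the samples; extend it as needed. A plain
                            # {"type": "string"} here would accept "bool".
                            "type": {"enum": ["string", "number"]},
                            "index": {"type": "integer"},
                        },
                        "required": ["type", "index"],
                    }
                },
                "additionalProperties": False,
            },
        },
        "required": ["type", "properties"],
    }
    
    # Sample valid JSON (from the question)
    valid_json = {
        "type": "csv",
        "properties": {
            "header1": {"type": "string", "index": 0},
            "header2": {"type": "number", "index": 1},
        },
    }
    
    # Sample invalid JSON (from the question)
    invalid_json = {
        "type": "CSV",
        "properties": {
            "header1": {"type": "bool"},
            "header2": {"type": "number", "index": 1},
        },
    }
    
    # Validate JSON against the schema
    try:
        validate(instance=valid_json, schema=schema)
        print("Valid JSON")
    except jsonschema.exceptions.ValidationError as e:
        print("Invalid JSON:", e.message)
    
    try:
        validate(instance=invalid_json, schema=schema)
        print("Valid JSON")
    except jsonschema.exceptions.ValidationError as e:
        print("Invalid JSON:", e.message)
    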
    

    You can customize the schema to fit your specific requirements.
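
    Since the question asks to validate every JSON document, it may help to build the validator once and list all violations for each document, because validate() raises only a single error. The following is a minimal sketch, not a definitive implementation. It assumes Draft 7 semantics and that "string" and "number" are the only allowed column types (an assumption drawn from the samples in the question). The names COLUMN_TYPES, SCHEMA, VALIDATOR and collect_errors are hypothetical.

```python
from jsonschema import Draft7Validator

# Assumption: only these column types are allowed; extend as needed.
COLUMN_TYPES = ["string", "number"]

SCHEMA = {
    "type": "object",
    "properties": {
        "type": {"enum": ["csv", "json"]},
        "properties": {
            "type": "object",
            # Every column, whatever its name, must match this sub-schema.
            "additionalProperties": {
                "type": "object",
                "properties": {
                    "type": {"enum": COLUMN_TYPES},
                    "index": {"type": "integer"},
                },
                "required": ["type", "index"],
            },
        },
    },
    "required": ["type", "properties"],
}

# Fail early if the schema itself is malformed, then reuse one validator.
Draft7Validator.check_schema(SCHEMA)
VALIDATOR = Draft7Validator(SCHEMA)


def collect_errors(document):
    """Return one 'path: message' string per schema violation."""
    errors = sorted(
        VALIDATOR.iter_errors(document),
        key=lambda e: [str(p) for p in e.absolute_path],
    )
    return [
        "/".join(str(p) for p in e.absolute_path) + ": " + e.message
        for e in errors
    ]


valid_doc = {
    "type": "csv",
    "properties": {
        "header1": {"type": "string", "index": 0},
        "header2": {"type": "number", "index": 1},
    },
}
invalid_doc = {
    "type": "CSV",
    "properties": {
        "header1": {"type": "bool"},
        "header2": {"type": "number", "index": 1},
    },
}

for name, doc in [("valid_doc", valid_doc), ("invalid_doc", invalid_doc)]:
    problems = collect_errors(doc)
    print(name, "is valid" if not problems else "has errors:")
    for line in problems:
        print("  ", line)
```

    With the samples from the question, the invalid document should be reported for its upper-case "CSV", for the "bool" column type, and for the missing index of header1.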