
Is the Python jsonschema validator using a superset of the actual jsonschema?


When using Python jsonschema it is possible to define schemas and instances that cannot be expressed in valid JSON.

>>> import jsonschema
>>> schema = {
...   "type": "object",
...   "properties": {"1": {}, 2:{}},
...   "additionalProperties": False
... }

Now

>>> jsonschema.validate({"1": "spam", 2: "eggs"}, schema)

does not raise an exception, while the code below fails:

>>> jsonschema.validate({1: "spam"}, schema)
Traceback (most recent call last):
   ...
jsonschema.exceptions.ValidationError: Additional properties are not allowed (1 was unexpected)

Failed validating 'additionalProperties' in schema:
    {'additionalProperties': False,
     'properties': {2: {}, '1': {}},
     'type': 'object'}

On instance:
    {1: 'spam'}

I'm a little confused here: the Python mapping {"1": "spam", 2: "eggs"} cannot be serialised as a valid JSON object without changing its keys, and the same applies to the schema mapping above. (In JSON, an object is a name/value mapping in which every name has to be a string; it cannot be an integer or any other data type.)
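As an illustration of the mismatch, here is a minimal sketch using only the standard-library json module (the variable names are just for this example). json.dumps does accept the mapping, but only by coercing the integer key to a string, so the round trip does not give back the original dict:

```python
import json

instance = {"1": "spam", 2: "eggs"}

# json.dumps coerces the integer key 2 to the string "2",
# because JSON object names are always strings.
text = json.dumps(instance)
round_tripped = json.loads(text)

print(round_tripped)              # {'1': 'spam', '2': 'eggs'}
print(round_tripped == instance)  # False: the key 2 became "2"
```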

Is this intended behaviour, i.e. are the jsonschema semantics deliberately extended to cover more general Python data types, or is the above use of schema invalid, so that the jsonschema library should flag it as an error? I read the docs, but was not able to find any mention of this point.


Solution

  • The Python jsonschema library, like most JSON Schema libraries, does not in fact operate on JSON. JSON is text. JSON Schema libraries generally operate on language-level objects, the ones that JSON libraries deserialize that text into.

    So yes, there are Python dicts you can construct that could never have come from JSON, like the one you have there.

    The type that jsonschema.validate expects, though, is effectively "dict that came from JSON". So if you give it one that could never be JSON, you may get unexpected results: a current or future version of jsonschema is free to assume all keys are already strings, and you may see TypeErrors from code that performs string operations on keys without converting them first.
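If you want to catch such inputs yourself before validating, one option is a small pre-check along these lines. This is only a sketch: non_string_key_paths is a hypothetical helper written for this answer, not part of jsonschema, and it assumes your data consists of plain dicts, lists and scalars.

```python
from collections.abc import Mapping


def non_string_key_paths(value, path="$"):
    """Return the paths of all mapping keys that are not str.

    Hypothetical helper for illustration; not part of jsonschema.
    """
    found = []
    if isinstance(value, Mapping):
        for key, child in value.items():
            child_path = f"{path}[{key!r}]"
            if not isinstance(key, str):
                found.append(child_path)
            # Recurse so that nested objects are checked as well.
            found.extend(non_string_key_paths(child, child_path))
    elif isinstance(value, (list, tuple)):
        for index, child in enumerate(value):
            found.extend(non_string_key_paths(child, f"{path}[{index}]"))
    return found


schema = {
    "type": "object",
    "properties": {"1": {}, 2: {}},
    "additionalProperties": False,
}

print(non_string_key_paths(schema))           # ["$['properties'][2]"]
print(non_string_key_paths({"1": "spam"}))    # []
```

An empty list means every key is a string, so the dict could at least have come from a JSON object. Run the same check on both the schema and the instance before calling jsonschema.validate.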