
JSONSchema: Google GeoCoding API - address components containing mandatory objects with arrays with at least one specific value


Assume the following JSON from the Google GeoCoding API:

{
    "lon": 1.2345678,
    "lat": 1.2345678,
    "formatted_adress": "Sunny Ave, 01234 Dev City, New Devland",
    "address_components": [
        {
            "long_name": "Sunny Ave",
            "short_name": "Sunny Ave",
            "types": [
                "route"
            ]
        },
        {
            "long_name": "Downtown",
            "short_name": "Downtown",
            "types": [
                "sublocality_level_1",
                "sublocality",
                "political"
            ]
        },
        {
            "long_name": "Dev City",
            "short_name": "Dev City",
            "types": [
                "locality",
                "political"
            ]
        },
        {
            "long_name": "Dev City",
            "short_name": "Dev City",
            "types": [
                "administrative_area_level_3",
                "political"
            ]
        },
        {
            "long_name": "Somwherechester",
            "short_name": "SC",
            "types": [
                "administrative_area_level_1",
                "political"
            ]
        },
        {
            "long_name": "New Devland",
            "short_name": "DE",
            "types": [
                "country",
                "political"
            ]
        },
        {
            "long_name": "01234",
            "short_name": "01234",
            "types": [
                "postal_code"
            ]
        }
    ]
}

I need to validate the data as shown (types, properties, etc.).

With one additional rule:

For each of locality, country and postal_code, address_components MUST contain at least one object whose types array includes that value. Additional objects are allowed.

Based on this solution https://github.com/orgs/json-schema-org/discussions/191 I got close, but only for the case where types is a single constant string. I have not managed to validate types when it is an array of strings. My attempt so far:

{
    "$schema": "https://json-schema.org/draft/2020-12/schema",
    "$id": "",
    "title": "Google APIs Geo Coding",
    "description": "Necessary address components.",
    "type": "object",
    "required": [
        "formatted_adress",
        "lon",
        "lat",
        "address_components"
    ],
    "properties": {
        "formatted_adress": {
            "type": "string"
        },
        "lon": {
            "type": "number"
        },
        "lat": {
            "type": "number"
        },
        "address_components": {
            "type": "array",
            "items": {
                "type": "object",
                "required": [
                    "long_name",
                    "short_name",
                    "types"
                ],
                "properties": {
                    "long_name": {
                        "type": "string"
                    },
                    "short_name": {
                        "type": "string"
                    },
                    "types": {
                        "type": "string"
                    }
                }
            },
            "allOf": [
                {
                    "contains": {
                        "properties": {
                            "types": {
                                "const": "locality"
                            }
                        }
                    }
                },
                {
                    "contains": {
                        "properties": {
                            "types": {
                                "const": "country"
                            }
                        }
                    }
                },
                {
                    "contains": {
                        "properties": {
                            "types": {
                                "const": "postal_code"
                            }
                        }
                    }
                }
            ]
        }
    }
}

Examples

Invalid data => there is no object with postal_code.

{
    "lon": 1.2345678,
    "lat": 1.2345678,
    "formatted_adress": "Sunny Ave, 01234 Dev City, New Devland",
    "address_components": [
        {
            "long_name": "Sunny Ave",
            "short_name": "Sunny Ave",
            "types": [
                "route"
            ]
        },
        {
            "long_name": "Downtown",
            "short_name": "Downtown",
            "types": [
                "sublocality_level_1",
                "sublocality",
                "political"
            ]
        },
        {
            "long_name": "Dev City",
            "short_name": "Dev City",
            "types": [
                "locality",
                "political"
            ]
        },
        {
            "long_name": "Dev City",
            "short_name": "Dev City",
            "types": [
                "administrative_area_level_3",
                "political"
            ]
        },
        {
            "long_name": "Somwherechester",
            "short_name": "SC",
            "types": [
                "administrative_area_level_1",
                "political"
            ]
        },
        {
            "long_name": "New Devland",
            "short_name": "DE",
            "types": [
                "country",
                "political"
            ]
        }
    ]
}

Valid data => all three required objects are present, each with at least one of the mentioned types, and there are no additional (optional) objects.

{
    "lon": 1.2345678,
    "lat": 1.2345678,
    "formatted_adress": "Sunny Ave, 01234 Dev City, New Devland",
    "address_components": [
        {
            "long_name": "Dev City",
            "short_name": "Dev City",
            "types": [
                "locality",
                "political"
            ]
        },
        {
            "long_name": "New Devland",
            "short_name": "DE",
            "types": [
                "country",
                "political"
            ]
        },
        {
            "long_name": "01234",
            "short_name": "01234",
            "types": [
                "postal_code"
            ]
        }
    ]
}

Solution

  • To validate types as an array of strings and require that locality, country and postal_code each appear in at least one component, nest a second contains inside the types property. The outer contains looks for a matching object in address_components, and the inner contains looks for the required value in that object's types array. Here's a refined version of your schema:

    {
        "$schema": "https://json-schema.org/draft/2020-12/schema",
        "$id": "",
        "title": "Google APIs Geo Coding",
        "description": "Necessary address components.",
        "type": "object",
        "required": [
            "formatted_adress",
            "lon",
            "lat",
            "address_components"
        ],
        "properties": {
            "formatted_adress": {
                "type": "string"
            },
            "lon": {
                "type": "number"
            },
            "lat": {
                "type": "number"
            },
            "address_components": {
                "type": "array",
                "items": {
                    "type": "object",
                    "required": [
                        "long_name",
                        "short_name",
                        "types"
                    ],
                    "properties": {
                        "long_name": {
                            "type": "string"
                        },
                        "short_name": {
                            "type": "string"
                        },
                        "types": {
                            "type": "array",
                            "items": {
                                "type": "string"
                            }
                        }
                    }
                },
                "allOf": [
                    {
                        "contains": {
                            "properties": {
                                "types": {
                                    "type": "array",
                                    "contains": {
                                        "enum": [
                                            "locality"
                                        ]
                                    }
                                }
                            }
                        }
                    },
                    {
                        "contains": {
                            "properties": {
                                "types": {
                                    "type": "array",
                                    "contains": {
                                        "enum": [
                                            "country"
                                        ]
                                    }
                                }
                            }
                        }
                    },
                    {
                        "contains": {
                            "properties": {
                                "types": {
                                    "type": "array",
                                    "contains": {
                                        "enum": [
                                            "postal_code"
                                        ]
                                    }
                                }
                            }
                        }
                    }
                ]
            }
        }
    }
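
The change that matters relative to the attempt in the question is the inner keyword: const compares the whole value of types, while contains only asks for one matching element. A rough Python analogy, meant as an illustration of the semantics and not of how any validator is implemented (the variable names are made up for this sketch):

```python
# Rough analogy for two JSON Schema keywords; not validator internals.
types = ["locality", "political"]

# "const": "locality" asks whether the whole value equals the string.
matches_const = types == "locality"

# "contains": {"enum": ["locality"]} asks whether any element equals it.
matches_contains = any(t == "locality" for t in types)

print(matches_const, matches_contains)
```

This is why the const-based schema rejects every response once types is an array, even when the required value is in it.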
    

    Key points:

    • types as an array

      The types property is now correctly defined as an array of strings.

    • contains keyword

      Used at two levels: on address_components to find a matching object, and on types to find the required value inside that object's array. An enum with a single value behaves like const here.

    • allOf for multiple checks

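      The schema ensures that the address_components array contains at least one object with locality, one with country, and one with postal_code as part of the types array. Because items already requires types on every component, the contains subschemas do not need a required of their own.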

    This schema will correctly validate that address_components includes the required address types.
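
If you want to sanity-check the presence rule outside of a schema validator, it can be restated in a few lines of plain Python. This is only a sketch of the rule that the allOf/contains combination expresses; the function name missing_required_types and the trimmed sample data are assumptions made for the example, and it does not check property types or required keys the way the schema does.

```python
REQUIRED_TYPES = ("locality", "country", "postal_code")


def missing_required_types(address_components, required=REQUIRED_TYPES):
    """Return the required types that no component lists in its types array."""
    present = set()
    for component in address_components:
        # Collect every type mentioned by any component.
        present.update(component.get("types", []))
    return [t for t in required if t not in present]


# Trimmed versions of the two examples from the question.
invalid_components = [
    {"long_name": "Dev City", "short_name": "Dev City",
     "types": ["locality", "political"]},
    {"long_name": "New Devland", "short_name": "DE",
     "types": ["country", "political"]},
]
valid_components = invalid_components + [
    {"long_name": "01234", "short_name": "01234", "types": ["postal_code"]},
]

print(missing_required_types(invalid_components))
print(missing_required_types(valid_components))
```

An empty list means all three required types were found; anything else names the types that are missing, which can be handy when a schema validator only reports that contains failed.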