I have a set of documents that I validate against schemas (shocker).
These documents are JSON messages from different clients, each using its own format, so a schema is defined for each type of document/message received from these clients.
I want to use a dispatcher (a dictionary with functions as values) to help perform the mapping/formatting of a document after it is validated against a matching schema.
Once I know the schema a message is valid against, I can then create the desired message payload for my various consumer services by calling the requisite mapping function.
To this end I need a key in my dispatcher which uniquely maps to its respective mapping function for that schema. The key also needs to be used to identify a schema so the correct mapping function can be called.
My question is this: Is there a way to embed a config value like a numeric ID into a schema?
I want to take this schema:
schema = {
    "timestamp": {"type": "number"},
    "values": {
        "type": "list",
        "schema": {
            "type": "dict",
            "schema": {
                "id": {"required": True, "type": "string"},
                "v": {"required": True, "type": "number"},
                "q": {"type": "boolean"},
                "t": {"required": True, "type": "number"},
            },
        },
    },
}
And add a schema_id like this:
schema = {
    "schema_id": 1,
    "timestamp": {"type": "number"},
    "values": {
        "type": "list",
        "schema": {
            "type": "dict",
            "schema": {
                "id": {"required": True, "type": "string"},
                "v": {"required": True, "type": "number"},
                "q": {"type": "boolean"},
                "t": {"required": True, "type": "number"},
            },
        },
    },
}
So after successful validation, a link is created from the message/document, via the schema's schema_id, to the corresponding mapping_function in the dispatcher.
Something like this:
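from cerberus import Validator  # assuming Cerberus, which this schema syntax matches

mapping_dispatcher = {1: map_function_1, 2: map_function_2}  # ...one entry per schema
validator = Validator()
if validator.validate(document, schema):
    schema_id = schema["schema_id"]  # named schema_id to avoid shadowing the built-in id()
    formatted_message = mapping_dispatcher[schema_id](document)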
A last-ditch effort could be to simply stringify the JSON schemas and use those strings as keys, but I'm not sure how I feel about that (it feels clever but wrong)...
I could also be going about this all wrong and there's a smarter way to do it.
Thanks!
Small update
I've hacked around it by stringifying the schema, converting to bytes, then hex, then adding the integer values together like so:
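import codecs

schema_id = 0
bytes_schema = str(schema).encode()  # stringify first; a dict cannot be encoded directly
hex_schema = codecs.encode(bytes_schema, "hex")
for char in hex_schema:  # iterating bytes yields ints (the codes of the hex digits)
    schema_id += int(char)

>>> schema_id
36832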
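One caveat with the sum above: it does not depend on character order, so two different schemas can easily end up with the same number. If a derived ID is still wanted, a standard digest over a canonical JSON dump is probably safer. The following is only a sketch: the helper name `schema_fingerprint` is made up for illustration, and it assumes the schema is JSON-serializable.

```python
import hashlib
import json


def schema_fingerprint(schema: dict) -> str:
    """Return a stable hex digest for a JSON-serializable schema (hypothetical helper)."""
    # sort_keys plus fixed separators yield the same text regardless of key order.
    canonical = json.dumps(schema, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


a = {"timestamp": {"type": "number"}, "q": {"type": "boolean"}}
b = {"q": {"type": "boolean"}, "timestamp": {"type": "number"}}  # same rules, reordered
c = {"timestamp": {"type": "string"}, "q": {"type": "boolean"}}  # one rule changed

print(schema_fingerprint(a) == schema_fingerprint(b))  # True
print(schema_fingerprint(a) == schema_fingerprint(c))  # False
```

The digest string can be used directly as a dispatcher key, though it changes whenever the schema is edited, so the dispatcher would have to be keyed by computed fingerprints rather than hand-written numbers.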
In the end, instead of using a hash function, I just wrapped each schema in another JSON object that holds the ID alongside it, like so:
[
    {
        "schema_id": "3",
        "schema": {
            "deviceName": {
                "type": "string"
            },
            "tagName": {
                "required": true,
                "type": "string"
            },
            "deviceID": {
                "type": "string"
            },
            "success": {
                "type": "boolean"
            },
            "datatype": {
                "type": "string"
            },
            "timestamp": {
                "required": true,
                "type": "number"
            },
            "value": {
                "required": true,
                "type": "number"
            },
            "registerId": {
                "type": "string"
            },
            "description": {
                "type": "string"
            }
        }
    }
]
Was overthinking it I guess.
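For anyone landing here, this is roughly how the wrapper list can drive the dispatcher. It is a sketch, not my exact code: `map_function_3`, `dispatch`, and the stand-in validator `required_fields_present` are hypothetical names, and the stand-in only checks required keys so the example runs without any third-party library.

```python
from typing import Any, Callable, Optional

Document = dict[str, Any]


def map_function_3(document: Document) -> Document:
    # Hypothetical mapper: reshape a validated message for a consumer service.
    return {"tag": document["tagName"], "ts": document["timestamp"], "val": document["value"]}


# Keys match the "schema_id" strings stored in the wrapper objects.
mapping_dispatcher: dict[str, Callable[[Document], Document]] = {"3": map_function_3}


def required_fields_present(document: Document, schema: dict) -> bool:
    # Stand-in validator (assumption): only checks that required keys exist.
    return all(name in document for name, rules in schema.items() if rules.get("required"))


def dispatch(document: Document, schema_entries: list, is_valid: Callable) -> Optional[Document]:
    """Format `document` with the mapper of the first schema it validates against."""
    for entry in schema_entries:
        if is_valid(document, entry["schema"]):
            return mapping_dispatcher[entry["schema_id"]](document)
    return None  # no schema matched


schema_entries = [
    {
        "schema_id": "3",
        "schema": {
            "tagName": {"required": True, "type": "string"},
            "timestamp": {"required": True, "type": "number"},
            "value": {"required": True, "type": "number"},
            "description": {"type": "string"},
        },
    }
]

message = {"tagName": "pump1", "timestamp": 1700000000, "value": 4.2}
print(dispatch(message, schema_entries, required_fields_present))
# {'tag': 'pump1', 'ts': 1700000000, 'val': 4.2}
```

With Cerberus (assuming that is the validation library in use), the stand-in could be swapped for something like `lambda doc, schema: Validator(schema).validate(doc)`. Keeping the ID outside the schema means the schema itself stays a plain set of validation rules.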