Is there a difference between a marshmallow.Schema (v3.0+) where a field with name foo is defined with attribute="bar", and another where a field with name bar is defined with data_key="foo"?
Both seem to serialize and deserialize dictionaries and other simple objects in the same way:

import marshmallow


class MySchema1(marshmallow.Schema):
    # Field named foo, reading from / writing to the attribute "bar"
    foo = marshmallow.fields.String(attribute="bar")


class MySchema2(marshmallow.Schema):
    # Field named bar, exposed under the key "foo" in serialized data
    bar = marshmallow.fields.String(data_key="foo")


schema1 = MySchema1()
schema2 = MySchema2()

# Round trip with a plain dictionary
dictionary = {"bar": "hello world"}
assert schema1.dump(dictionary) == schema2.dump(dictionary)
assert schema1.load(schema1.dump(dictionary)) == schema2.load(schema2.dump(dictionary))


class Thingy:
    def __init__(self):
        self.bar = "hello world"


# Round trip with a simple object
instance = Thingy()
assert schema1.dump(instance) == schema2.dump(instance)
assert schema1.load(schema1.dump(instance)) == schema2.load(schema2.dump(instance))

The above passes. This isn't causing any errors in my project currently, but I am curious what the difference is! Thanks in advance.
You are right. Both schemas behave identically.
This could be seen as a redundant API: from your example, you might wonder why attribute exists if data_key provides the same functionality.
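In practice, it is useful to have both, because together they let you use keys that are not valid Python identifiers on both sides: attribute names the key in the deserialized (loaded) data, and data_key names the key in the serialized (dumped) data.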

class MySchema(marshmallow.Schema):
    foo = marshmallow.fields.String(attribute="bar-load", data_key="bar-dump")

AFAIK, this is the reason we didn't drop attribute in marshmallow. There may be other reasons, but this one seems like a good one already.