python mongodb datetime pymongo marshmallow

python, mongo and marshmallow: datetime struggles


I'm trying to do something pretty simple: get the current time, validate my object with marshmallow, and store it in mongo.

python 3.7

requirements:

datetime==4.3
marshmallow==3.5.1
pymongo==3.10.1

schema.py

from marshmallow import Schema, fields
...
class MySchema(Schema):
    user_id = fields.Str(required=True)
    user_name = fields.Str()
    date = fields.DateTime()
    account_type = fields.Str()
    object = fields.Raw()

preparedata.py

from datetime import datetime
from schema import MySchema
... 
        dt = datetime.now()
        x = dt.isoformat()
        data = {
            "user_id": '123123123',
            "user_name": 'my cool name',
            "date":  x,
            "account_type": 'another sting',
            "trade": {'some':'dict'}
        }
        # validate the schema for storage
        validator = MySchema().load(data)
        if 'errors' in validator:
            log.info('validator.errors')
            log.info(validator.errors)
...
        res = MyService().create(
            data
        )

myservice.py

    def create(self, data):
        log.info("in creating data service")
        log.info(data)

        self.repo.create(data)
        return MySchema().dump(data)

The connector to mongo is fine: I am saving other data that has no datetime without any issue. I seem to have gone through a hundred variations of formatting the datetime before passing it to the date key, as well as specifying the 'format' option on the schema field, both inline and in the Meta class, for example:

    #class Meta:
    #    datetimeformat = '%Y-%m-%dT%H:%M:%S+03:00'

Most variations I try result in:

{'date': ['Not a valid datetime.']}

I finally managed to pass validation on load by simply using

 x = dt.isoformat()

and leaving the schema field at its default ( date = fields.DateTime() ),

but when I dump back through marshmallow I get

AttributeError: 'str' object has no attribute 'isoformat'

The record is created in MongoDB fine, but the field type is string; ideally I'd like to use the native mongo date type.

If I try to pass

 datetime.now()

as the date, it fails with

{'date': ['Not a valid datetime.']}

same for

datetime.utcnow()

Any guidance really appreciated.


Edit: when bypassing marshmallow, and using either

datetime.now(pytz.utc)

or

datetime.utcnow() 

the field data is stored in mongo as a date, as expected. So I think the issue can be stated more succinctly as: how can I have marshmallow fields.DateTime() validate either of these values?


Edit 2: We have already begun refactoring thanks to Jérôme's insightful answer below. For anyone who wants to 'twist' marshmallow to behave as the original question described, we ended up going with:

    date = fields.DateTime(
        #dump_only=True,
        default=lambda: datetime.utcnow(),
        missing=lambda: datetime.utcnow(),
        allow_none=False
    )

I.e., skip passing date at all and have marshmallow generate it from missing, which satisfied our use case.


Solution

  • The point of marshmallow is to load data from serialized (say, JSON, isoformat string, etc.) into actual Python objects (int, datetime,...). And conversely to dump it from object to a serialized string.

    Marshmallow also provides validation on load, and only on load. When dumping, the data comes from the application and shouldn't need validation.

    It is useful in an API to load and validate data from the outside world before using it in an application. And to serialize it back to the outside world.

    If your data is in serialized form, which is the case when you call isoformat() on your datetime, then marshmallow can load it, and you get a Python object, with a real datetime in it. This is what you should feed pymongo.

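        from marshmallow import ValidationError
        ...
        # load/validate the schema for storage
        try:
            loaded_data = MySchema().load(data)
        except ValidationError as exc:
            # exc.messages maps each invalid field to its error messages
            log.info('validator.errors')
            log.info(exc.messages)
        ...
        # Store the deserialized object (with a real datetime) in the database
        res = MyService().create(loaded_data)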
    

    Since marshmallow 3, load always returns the deserialized content and raises ValidationError on invalid input, so you need a try/except around it.

    If your data does not come to your application in serialized form (if it is in object form already), then maybe marshmallow is not the right tool for the job, because it does not perform validation on deserialized objects (see https://github.com/marshmallow-code/marshmallow/issues/1415).

    Or maybe it is. You could use an Object-Document Mapper (ODM) to handle both validation and database access. This is an extra layer over pymongo. umongo is a marshmallow-based MongoDB ODM. There are other ODMs out there: mongoengine, pymodm.
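
If the data is already in object form and an ODM feels like too much, a plain type check before the insert may be enough. The following is only a sketch using the standard library. `EXPECTED_TYPES` and `check_types` are hypothetical names invented for this example and are not part of marshmallow, pymongo, or umongo.

```python
from datetime import datetime

# Hypothetical mapping of field name to the Python type expected before storage.
EXPECTED_TYPES = {"user_id": str, "date": datetime}


def check_types(doc):
    """Return {field: message} for missing or wrongly typed fields."""
    errors = {}
    for name, expected in EXPECTED_TYPES.items():
        if name not in doc:
            errors[name] = "missing"
        elif not isinstance(doc[name], expected):
            errors[name] = "expected " + expected.__name__
    return errors


# A document that is already in object form: the date is a real datetime.
good = {"user_id": "123123123", "date": datetime(2020, 3, 1, 12, 30)}

# A document whose date is still a serialized string.
bad = {"user_id": "123123123", "date": "2020-03-01T12:30:00"}
```

Here `check_types(good)` returns an empty dict, while `check_types(bad)` flags the date field because it still holds a string.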

    BTW, what is this

    datetime==4.3
    

    Did you install DateTime? You don't need this: the datetime module you import is part of Python's standard library.

    Disclaimer: marshmallow and umongo maintainer speaking.