Tags: python, rest, flask, flask-sqlalchemy, flask-restful

Why does fields.Url('...') break in Flask-RESTful depending on the number of rendered fields?


I have the following small Flask-RESTful-based API:

from app import api, db, models
from flask import request
# The flask.ext.* import namespace was removed in Flask 1.0; import the package directly.
from flask_restful import Resource, fields, marshal_with

foo_fields = {
    'id': fields.Integer,
    'name': fields.String,
    'self': fields.Url('.foo'),
    'created': fields.DateTime(dt_format='iso8601'),
    'updated': fields.DateTime(dt_format='iso8601'),
    'kind': fields.Raw('foo'),
}

class Foo(Resource):
    @marshal_with(foo_fields)
    def get(self, id):
        account = models.Foo.query.filter_by(id=id).first_or_404()
        return account

class FooList(Resource):
    @marshal_with(foo_fields)
    def get(self):
        accounts = models.Foo.query.order_by(models.Foo.id).all()
        return accounts

    @marshal_with(foo_fields)
    def post(self):
        json = request.get_json()
        foo = models.Foo()
        foo.name = json['name']
        db.session.add(foo)
        db.session.commit()
        return foo, 201

api.add_resource(Foo, '/foo/<int:id>', endpoint='foo')
api.add_resource(FooList, '/foo', endpoint='foo_list')

There is a small model backed by SQLAlchemy, currently using SQLite for testing:

from app import app, db
from sqlalchemy.sql import func

class Foo(db.Model):
    __tablename__ = 'foo'
    id = db.Column(db.Integer, primary_key=True)
    name = db.Column(db.String(64), index=True, unique=True)
    created = db.Column(db.DateTime, default=func.now())
    updated = db.Column(db.DateTime, default=func.now())

Now, I can use this to POST a new object like so:

$ curl -v -X POST localhost:5000/api/v1/foo -H 'Content-Type: application/json' -d '{ "name": "foo" }'
{
    "created": "2015-10-21T12:39:41", 
    "id": 1, 
    "kind": "foo", 
    "name": "foo", 
    "self": "/api/v1/foo/1", 
    "updated": "2015-10-21T12:39:41"
}

However, if I reduce the number of fields used when rendering the output (dropping either the updated or the kind field is enough), then the next time I POST a new object, and only when I POST, I get the following error:

BuildError: ('v1.foo', {'_sa_instance_state': <sqlalchemy.orm.state.InstanceState object at 0x1054a3d10>}, None)

This only happens while fields.Url('.foo') is still among the marshalled fields. If I remove that as well, everything works again. GET requests always work regardless.

I can also make things work by accessing any one of the object's attributes after db.session.commit(), e.g. a plain print foo.id before the return.

Can someone explain why the code seems a bit fragile here? I can see that by accessing a field post-commit it's probably triggering a read from the database.


Solution

  • I think I've spotted the problem: it comes down to the order in which Python iterates over the keys of a dictionary, which on this version of Python is arbitrary rather than insertion order.

    If I remove enough keys, the one holding the fields.Url('.foo') value becomes the first key visited when iterating over the dictionary. By default SQLAlchemy expires every instance after a session commit, so at that point the new object created in the POST request has no loaded attributes (the BuildError above shows only _sa_instance_state among its values), and there is no id from which to compute the URL.

    Accessing any other field, either because it happens to come first when iterating over the keys or via my print foo.id hack, is enough to trigger a read from the database that refreshes the object's state, after which computing the URL works as expected.
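The ordering effect described above can be reproduced with a toy model that uses only the standard library. This is a sketch under stated assumptions, not SQLAlchemy or Flask-RESTful code: ExpiringRecord, plain_field, url_field and marshal are hypothetical stand-ins, and the assumption that the URL field reads the raw attribute dictionary instead of going through attribute access is inferred from the BuildError in the question.

```python
class ExpiringRecord:
    """Toy stand-in for an ORM instance whose attributes expire on commit."""

    def __init__(self, **row):
        self._row = dict(row)        # plays the role of the database row
        self.__dict__.update(row)    # "loaded" column values

    def expire(self):
        # Drop every loaded column value, as a commit would by default.
        for name in self._row:
            self.__dict__.pop(name, None)

    def __getattr__(self, name):
        # Only called when normal lookup fails, i.e. for expired columns.
        row = self.__dict__.get("_row", {})
        if name not in row:
            raise AttributeError(name)
        self.__dict__.update(row)    # first access reloads every column
        return row[name]


def plain_field(name):
    # Ordinary field: uses attribute access, which triggers a reload.
    return lambda obj: getattr(obj, name)


def url_field(obj):
    # URL field (assumed behaviour): reads the raw dict, never reloads.
    state = vars(obj)
    if "id" not in state:
        raise LookupError("cannot build URL: 'id' is not loaded")
    return "/api/v1/foo/{}".format(state["id"])


def marshal(obj, fields):
    # Render the fields strictly in the order given.
    return dict((key, render(obj)) for key, render in fields)


url_first = [("self", url_field), ("name", plain_field("name"))]
url_last = [("name", plain_field("name")), ("self", url_field)]

foo = ExpiringRecord(id=1, name="foo")
foo.expire()
try:
    marshal(foo, url_first)
    url_first_ok = True
except LookupError:
    url_first_ok = False

foo.expire()
result = marshal(foo, url_last)
```

With the URL field first, nothing has reloaded the record yet and the lookup fails. With any ordinary field ahead of it, the reload has already happened by the time the URL is built.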

    I suppose I can work around this either by setting expire_on_commit=False on the SQLAlchemy session or by using an OrderedDict and making sure the URL field is not first. I'm not sure what side effects the former introduces, though.
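To see what expire_on_commit changes, here is a minimal sketch using plain SQLAlchemy (1.4 or later is assumed for the sqlalchemy.orm.declarative_base import) with an in-memory SQLite database. The two-column Foo model is a cut-down, hypothetical version of the one in the question, and loaded_columns_after_commit is a helper invented for this illustration.

```python
from sqlalchemy import Column, Integer, String, create_engine
from sqlalchemy.orm import declarative_base, sessionmaker

Base = declarative_base()


class Foo(Base):
    # Cut-down version of the model in the question (assumption).
    __tablename__ = "foo"
    id = Column(Integer, primary_key=True)
    name = Column(String(64))


engine = create_engine("sqlite://")  # in-memory database
Base.metadata.create_all(engine)


def loaded_columns_after_commit(expire_on_commit):
    """Insert a row, commit, and report which columns are still loaded."""
    Session = sessionmaker(bind=engine, expire_on_commit=expire_on_commit)
    session = Session()
    foo = Foo(name="foo-{}".format(expire_on_commit))
    session.add(foo)
    session.commit()
    # vars() reads the instance dict directly and does not trigger a reload.
    loaded = sorted(k for k in vars(foo) if not k.startswith("_sa_"))
    session.close()
    return loaded


default_behaviour = loaded_columns_after_commit(True)
without_expiry = loaded_columns_after_commit(False)
```

With the default setting no column values remain loaded after the commit, while with expire_on_commit=False the id and name stay on the instance. The usual trade-off is that an unexpired object can carry stale values if another transaction changes the row later. With Flask-SQLAlchemy the option can, as far as I know, be passed as SQLAlchemy(app, session_options={'expire_on_commit': False}). Calling db.session.refresh(foo) before returning is a narrower alternative that reloads only that one object.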