Tags: google-app-engine, google-cloud-datastore, acl, endpoints-proto-datastore

Row level access for google appengine datastore queries


I'm trying to implement row-level access control on Google App Engine datastore tables. So far I have a working example for the regular ndb put(), get() and delete() operations using hooks.

The Acl class is meant to be shared by all the other tables, which embed it as a structured property.

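import endpoints
from google.appengine.ext import ndb
from endpoints_proto_datastore.ndb import EndpointsModel

class Acl(EndpointsModel):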
    UNAUTHORIZED_ERROR = 'Invalid token.'
    FORBIDDEN_ERROR = 'Permission denied.'

    public = ndb.BooleanProperty()
    readers = ndb.UserProperty(repeated=True)
    writers = ndb.UserProperty(repeated=True)
    owners = ndb.UserProperty(repeated=True)

    @classmethod
    def require_user(cls):
        current_user = endpoints.get_current_user()
        if current_user is None:
            raise endpoints.UnauthorizedException(cls.UNAUTHORIZED_ERROR)
        return current_user

    @classmethod
    def require_reader(cls, record):
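        if not record:
            # record is None here, so record.NOT_FOUND_ERROR cannot be read;
            # a generic message is used instead.
            raise endpoints.NotFoundException('Record not found.')
        current_user = cls.require_user()
        # Deny only when the record is neither public nor readable by this user.
        if record.acl.public is not True and current_user not in record.acl.readers:
            raise endpoints.ForbiddenException(cls.FORBIDDEN_ERROR)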

I want to protect access to the Location class, so I added three hooks (_post_get_hook, _pre_put_hook and _pre_delete_hook) to it.

class Location(EndpointsModel):
    QUERY_FIELDS = ('state', 'limit', 'order', 'pageToken')
    NOT_FOUND_ERROR = 'Location not found.'

    description = ndb.TextProperty()
    address = ndb.StringProperty()
    acl = ndb.StructuredProperty(Acl)

    @classmethod
    def _post_get_hook(cls, key, future):
        location = future.get_result()
        Acl.require_reader(location)

    def _pre_put_hook(self):
        if self.key.id() is None:
            current_user = Acl.require_user()
            self.acl = Acl()
            self.acl.readers.append(current_user)
            self.acl.writers.append(current_user)
            self.acl.owners.append(current_user)
        else:
            location = self.key.get()
            Acl.require_writer(location)

This works for create, read, update and delete operations, but it does not work for queries.

@Location.query_method(user_required=True,
                       path='location', http_method='GET', name='location.query')
def location_query(self, query):
    """
    Queries locations
    """
    current_user = Acl.require_user()
    query = query.filter(ndb.OR(Location.acl.readers == current_user, Location.acl.public == True))
    return query

When I run a query against all locations I get the following error message:

BadArgumentError: _MultiQuery with cursors requires __key__ order

Now I've got some questions:

  • How do I fix the _MultiQuery issue?
  • Once fixed: Does this Acl implementation make sense? Are there out of the box alternatives? (I wanted to store the Acl on the record itself to be able to run a direct query, without having to get the keys first.)

Solution

  • Datastore doesn't support OR filters natively. Behind the scenes, NDB instead runs two separate queries:

        query.filter(Location.acl.readers == current_user)
        query.filter(Location.acl.public == True)
    

    It then merges the results of these two queries into a single result set. In order to properly merge results (in particular to eliminate duplicates when you have repeated properties), the query needs to be ordered by the key when continuing the query from an arbitrary position (using cursors).

    In order to run the query successfully, you need to append a key order to the query before running it:

        def location_query(self, query):
            """
            Queries locations
            """
            current_user = Acl.require_user()
            query = query.filter(ndb.OR(Location.acl.readers == current_user,
                                        Location.acl.public == True)
                                 ).order(Location.key)
            return query
    

    Unfortunately, your ACL implementation will not work for queries. In particular, _post_get_hook is not called for query results. There is a bug filed on the issue tracker about this.
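
To illustrate the merge step described earlier in this answer, here is a rough pure-Python sketch of why a key order helps. It is only an analogy, not NDB's actual implementation: the helper names (merge_by_key, page), the integer keys, and the sample data are all hypothetical. The idea is that when both sub-query result streams are sorted by key, duplicates end up next to each other after merging, and a single "last key seen" position is enough to resume.

```python
import heapq
from itertools import groupby


def merge_by_key(*streams):
    """Merge key-sorted (key, entity) streams, keeping one entry per key."""
    merged = heapq.merge(*streams, key=lambda pair: pair[0])
    # The merged stream is sorted by key, so duplicates are adjacent
    # and groupby can collapse them.
    return [next(group) for _, group in groupby(merged, key=lambda pair: pair[0])]


def page(streams, after_key=None, limit=2):
    """Return up to `limit` merged results whose key comes after `after_key`."""
    merged = merge_by_key(*streams)
    remaining = [pair for pair in merged
                 if after_key is None or pair[0] > after_key]
    return remaining[:limit]


# Hypothetical results of the two sub-queries, each already ordered by key.
readers_hits = [(1, 'loc-1'), (3, 'loc-3'), (5, 'loc-5')]  # acl.readers == user
public_hits = [(2, 'loc-2'), (3, 'loc-3'), (4, 'loc-4')]   # acl.public == True

results = merge_by_key(readers_hits, public_hits)
first_page = page([readers_hits, public_hits])
# Resume from the last key of the previous page, much like a cursor would.
next_page = page([readers_hits, public_hits], after_key=first_page[-1][0])
```

In this sketch, 'loc-3' matches both sub-queries but appears only once in the merged results, and the second page picks up after the last key of the first page without repeating or skipping anything. Without a shared key order, there would be no single position from which both sub-queries could be resumed consistently.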