python tree sqlalchemy hybrid self-reference

SQLAlchemy: Recursive hybrid property in a tree node

I've the following self-referential (tree) node, and wish to filter/sort by the calculated properties uuid_path and name_path:

class Node (db.Model):
    id = db.Column (db.Integer, db.Sequence ('node_id_seq'), primary_key=True)

    ###########################################################################

    root_id = db.Column (db.Integer, db.ForeignKey (id, ondelete='CASCADE'),
        index=True)

    nodes = db.relationship ('Node',
        cascade='all, delete-orphan', lazy='dynamic',
        primaryjoin='Node.id==Node.root_id',
        backref=db.backref ('root', remote_side=id))

    ###########################################################################

    _uuid = db.Column (db.String (36), nullable=False, index=True, unique=True,
        name = 'uuid')
    _name = db.Column (db.Unicode (256), nullable=False, index=True,
        name = 'name')

    ###########################################################################

    @hybrid_property
    def uuid (self):
        return self._uuid

    @hybrid_property
    def name (self):
        return self._name
    @name.setter
    def name (self, value):
        self._name = value

    ###########################################################################

    def __init__ (self, name, root, mime=None, uuid=None):

        self.root = root
        self._uuid = uuid if uuid else str (uuid_random ())
        self._name = unicode (name) if name is not None else None

    def __repr__ (self):

        return u'<Node@%x: %s>' % (self.id if self.id else 0, self._name)

    ###########################################################################

    @hybrid_property
    def uuid_path (self):
        node, path = self, []
        while node:

            path.insert (0, node.uuid)
            node = node.root

        return os.path.sep.join (path)

    @hybrid_property
    def name_path (self):
        node, path = self, []
        while node:

            path.insert (0, node.name)
            node = node.root

        return os.path.sep.join (path)

    ###########################################################################

If I get a Node instance subnode and execute e.g. subnode.name_path then I get correctly e.g. root/subnode. But if I try to use Node.name_path (for filtering/sorting) then SQLAlchemy complains:

Neither 'InstrumentedAttribute' object nor 'Comparator' object associated with Node.root has an attribute 'name'.

I'm pretty sure I've to introduce something like:

class Node (db.Model):

    @hybrid_property
    def name_path (self):
        node, path = self, []
        while node:

            path.insert (0, node.name)
            node = node.root

        return os.path.sep.join (path)

    @name_path.expression
    def name_path (cls):
        ## Recursive SQL expression??

But I struggle to get a correct definition for @name_path.expression (or @uuid_path.expression); it should somehow instruct SQL to deliver the path from the root node down to the node in question.

What I don't understand is why this is required, since I've told SQLAlchemy to iteratively calculate the path values. Thanks for your help.

Solution

All right after tweaking with PostgreSQL and SQLAlchemy, I think I've a solution: (1) First, I'd to write the query as a function in SQL, and (2) second provide the correct SQLAlchemy glue:

The SQL part uses a WITH RECURSIVE CTE:

CREATE OR REPLACE FUNCTION name_path(node)
  RETURNS text AS
$BODY$

WITH RECURSIVE graph (id, root_id, id_path, name_path) AS (
    SELECT n.id, n.root_id, ARRAY[n.id], ARRAY[n.name]
    FROM node n
UNION
    SELECT n.id, n.root_id, id_path||ARRAY[n.id], name_path||ARRAY[n.name]
    FROM node n, graph g
    WHERE n.root_id = g.id)

SELECT array_to_string (g.name_path, '/','.*')
FROM graph g
WHERE (g.id_path[1] = $1.base_id OR g.root_id IS NULL)
AND (g.id = $1.id)

$BODY$
  LANGUAGE sql STABLE
  COST 100;
ALTER FUNCTION name_path(node)
  OWNER TO webed;

and the SQLAlchemy side looks like this:

class NamePathColumn (ColumnClause):
    pass

@compiles (NamePathColumn)
def compile_name_path_column (element, compiler, **kwargs):
    return 'node.name_path' ## something missing?

and

class Node (db.Model):

    def get_path (self, field):

        @cache.version (key=[self.uuid, 'path', field])
        def cached_path (self, field):

            if self.root:
                return self.root.get_path (field) + [getattr (self, field)]
            else:
                return [getattr (self, field)]

        if field == 'uuid':
            return cached_path (self, field)
        else:
            return cached_path.uncached (self, field)

    @hybrid_property
    def name_path (self):
        return os.path.sep.join (self.get_path (field='name'))

    @name_path.expression
    def name_path (cls):
        return NamePathColumn (cls)

I avoid accessing Node.name_path on the DB if I'm on the pure Python side, but probably it would be faster if I would. The only thing I'm still not so sure about is in compile_name_path_column I do not consider any of the element, compiler, **kwargs parameters, which makes me a little suspicious.

I just cooked this up, after tinkering for about 1.5 days with SA & PG, so it's very much possible that there is still room for improvement. I'd very much appreciate any remarks w.r.t. this approach. Thanks.