I've the following self-referential (tree) node, and wish to filter/sort by the calculated properties uuid_path
and name_path
:
class Node (db.Model):
id = db.Column (db.Integer, db.Sequence ('node_id_seq'), primary_key=True)
###########################################################################
root_id = db.Column (db.Integer, db.ForeignKey (id, ondelete='CASCADE'),
index=True)
nodes = db.relationship ('Node',
cascade='all, delete-orphan', lazy='dynamic',
primaryjoin='Node.id==Node.root_id',
backref=db.backref ('root', remote_side=id))
###########################################################################
_uuid = db.Column (db.String (36), nullable=False, index=True, unique=True,
name = 'uuid')
_name = db.Column (db.Unicode (256), nullable=False, index=True,
name = 'name')
###########################################################################
@hybrid_property
def uuid (self):
return self._uuid
@hybrid_property
def name (self):
return self._name
@name.setter
def name (self, value):
self._name = value
###########################################################################
def __init__ (self, name, root, mime=None, uuid=None):
self.root = root
self._uuid = uuid if uuid else str (uuid_random ())
self._name = unicode (name) if name is not None else None
def __repr__ (self):
return u'<Node@%x: %s>' % (self.id if self.id else 0, self._name)
###########################################################################
@hybrid_property
def uuid_path (self):
node, path = self, []
while node:
path.insert (0, node.uuid)
node = node.root
return os.path.sep.join (path)
@hybrid_property
def name_path (self):
node, path = self, []
while node:
path.insert (0, node.name)
node = node.root
return os.path.sep.join (path)
###########################################################################
If I get a Node
instance subnode
and execute e.g. subnode.name_path
then I get correctly e.g. root/subnode
. But if I try to use Node.name_path
(for filtering/sorting) then SQLAlchemy complains:
Neither 'InstrumentedAttribute' object nor 'Comparator' object associated with Node.root has an attribute 'name'.
I'm pretty sure I've to introduce something like:
class Node (db.Model):
@hybrid_property
def name_path (self):
node, path = self, []
while node:
path.insert (0, node.name)
node = node.root
return os.path.sep.join (path)
@name_path.expression
def name_path (cls):
## Recursive SQL expression??
But I struggle to get a correct definition for @name_path.expression
(or @uuid_path.expression
); it should somehow instruct SQL to deliver the path from the root node down to the node in question.
What I don't understand is why this is required, since I've told SQLAlchemy to iteratively calculate the path values. Thanks for your help.
All right after tweaking with PostgreSQL and SQLAlchemy, I think I've a solution: (1) First, I'd to write the query as a function in SQL, and (2) second provide the correct SQLAlchemy glue:
The SQL part uses a WITH RECURSIVE
CTE:
CREATE OR REPLACE FUNCTION name_path(node)
RETURNS text AS
$BODY$
WITH RECURSIVE graph (id, root_id, id_path, name_path) AS (
SELECT n.id, n.root_id, ARRAY[n.id], ARRAY[n.name]
FROM node n
UNION
SELECT n.id, n.root_id, id_path||ARRAY[n.id], name_path||ARRAY[n.name]
FROM node n, graph g
WHERE n.root_id = g.id)
SELECT array_to_string (g.name_path, '/','.*')
FROM graph g
WHERE (g.id_path[1] = $1.base_id OR g.root_id IS NULL)
AND (g.id = $1.id)
$BODY$
LANGUAGE sql STABLE
COST 100;
ALTER FUNCTION name_path(node)
OWNER TO webed;
and the SQLAlchemy side looks like this:
class NamePathColumn (ColumnClause):
pass
@compiles (NamePathColumn)
def compile_name_path_column (element, compiler, **kwargs):
return 'node.name_path' ## something missing?
and
class Node (db.Model):
def get_path (self, field):
@cache.version (key=[self.uuid, 'path', field])
def cached_path (self, field):
if self.root:
return self.root.get_path (field) + [getattr (self, field)]
else:
return [getattr (self, field)]
if field == 'uuid':
return cached_path (self, field)
else:
return cached_path.uncached (self, field)
@hybrid_property
def name_path (self):
return os.path.sep.join (self.get_path (field='name'))
@name_path.expression
def name_path (cls):
return NamePathColumn (cls)
I avoid accessing Node.name_path
on the DB if I'm on the pure Python side, but probably it would be faster if I would. The only thing I'm still not so sure about is in compile_name_path_column
I do not consider any of the element, compiler, **kwargs
parameters, which makes me a little suspicious.
I just cooked this up, after tinkering for about 1.5 days with SA & PG, so it's very much possible that there is still room for improvement. I'd very much appreciate any remarks w.r.t. this approach. Thanks.