I am querying (via sqlalchemy) my_table with a conditional on a column and then retrieve distinct values in another column. Quite simply
selection_1 = session.query(func.distinct(my_table.col2)).\
filter(my_table.col1 == value1)
I need to do this repeatedly to get distinct values from different columns from my_table.
selection_2 = session.query(func.distinct(my_table.col3)).\
filter(my_table.col1 == value1).\
filter(my_table.col2 == value2)
selection_3 = session.query(func.distinct(my_table.col4)).\
filter(my_table.col1 == value1).\
filter(my_table.col2 == value2).\
filter(my_table.col3 == value3)
The above code works, but as I need to have 6 successive calls it's getting a bit out of hand. I have created a class to handle the method chaining:
class QueryProcessor:
def add_filter(self, my_table_col, value):
filter(my_table_col == value)
return self
def set_distinct_col(self, my_other_table_col):
self.my_other_table_col = my_other_table_col
return session.query(func.distinct(self.my_other_table_col))
Ideally I'd be able to use the class like
selection_1 = QueryProcessor().set_distinct_col(my_table.col2).add_filter(my_table.col1, value1)
selection_2 = selection_1.set_distinct_col(my_table.col3).add_filter(my_table.col2, value2)
selection_3 = selection_2.set_distinct_col(my_table.col4).add_filter(my_table.col3, value3)
but when I run
selection_1 = QueryProcessor().set_distinct_col(my_table.col2).add_filter(my_table.col1, value1)
I get the following error:
Traceback (most recent call last):
File " ... "
exec(code_obj, self.user_global_ns, self.user_ns)
File "<ipython-input-20-789b26eccbc5>", line 10, in <module>
selection_1 = QueryProcessor().set_distinct_col(my_table.col2).add_filter(my_table.col1, value1)
AttributeError: 'Query' object has no attribute 'add_filter'
Any help will be much welcomed.
You don't really need a special class for this. Your existing code
selection_2 = session.query(func.distinct(my_table.col3)).\
filter(my_table.col1 == value1).\
filter(my_table.col2 == value2)
works because filter
is returning a new query based on the original query, but with an additional filter added to it. You can just iterate over the columns and their corresponding values, replacing each old query with its successor.
selection2 = session.query(func.distinct(my_table.col3))
for col, val in zip([my_table.col1, my_table.col2], [value1, value2]):
selection2 = selection2.filter(col == val)
selection_3 = session.query(func.distinct(my_table.col4))
for col, val in zip([mytable.col1, mytable.col2, mytable.col3],
[value1, value2, value3]):
selection_3 = selection_3.filter(col == val)
That said, the problem with your code is that add_filter
doesn't actually call the query's filter
method, or update the wrapped query.
class QueryProcessor:
def set_distinct_col(self, my_other_table_col):
self.query = session.query(func.distinct(self.my_other_table_col))
return self
def add_filter(self, my_table_col, value):
self.query = self.query.filter(my_table_col == value)
return self
This poses a problem, though: set_distinct_col
creates a new query, so it doesn't really make sense in the following
selection_1 = QueryProcessor().set_distinct_col(my_table.col2).add_filter(my_table.col1, value1)
selection_2 = selection_1.set_distinct_col(my_table.col3).add_filter(my_table.col2, value2)
selection_3 = selection_2.set_distinct_col(my_table.col4).add_filter(my_table.col3, value3)
to call set_distinct_col
on an existing instance. It can return either a new query or the existing one, but not both (at least, not if you want to do chaining).
Also, note that selection_1
itself is not the query, but selection_1.query
.