Search code examples
pythonsortinglambdaponyorm

Pony ORM - Order by specific order


Performing a Pony ORM query and attempting to sort the query by three attributes that live on the model. First by song type, which can be one of the five values listed in ssf_type_order_map, then by duration (int), and uuid (string).

For song type, I would like to have the songs sorted in the following order: Full, Full (Instrumental), Shorts, Loops, Stems

If I attempt to sort using the following .order_by() call, it doesn't return any errors but it doesn't sort by the type as I need it in the aforementioned order (duration and UUID sorting works fine though).

song_source_files = self.song_source_files.select(lambda ssf: True).order_by(lambda ssf: (ssf.type, ssf.duration, ssf.uuid))

This is what I would think would be the ideal query, map the string types to a map that ranks their ordering.

ssf_type_order_map = {
    'Full': 1,
    'Full (Instrumental)': 2,
    'Shorts': 3,
    'Loops': 4,
    'Stems': 5
}

song_source_files = self.song_source_files.select(lambda ssf: True).order_by(lambda ssf: (ssf_type_order_map[ssf.type], ssf.duration, ssf.uuid))

But I get an error when running this saying "Expression ssf_type_order_map has unsupported type 'dict'.

The Pony ORM docs on order_by here, are very vague on using lambdas in this context.

Update - Sept 7th

I've also tried adding the following getter property on the model as follows:

@property
def source_type(self):
  ssf_type_order_map = {
    'Full': 1,
    'Full (Instrumental)': 2,
    'Shorts': 3,
    'Loops': 4,
    'Stems': 5
  }

  return ssy_type_order_map[self.type]

I then try ordering the query as follows:

song_source_files = self.song_source_files.select(lambda ssf: True).order_by(lambda ssf: (ssf_type_order_map[ssf.type], ssf.duration, ssf.uuid))

But I receive an error basically saying that the model does not have this property. My assumption based off of a similar issue with Django's ORM is that you can only access attributes that are present in the database models.

If that is the case with Pony as well, how does one pull off something that I would like to accomplish?


Solution

  • At first, I want to say that Pony distinguishes two types of sub-expressions: external expressions and correlated expressions. External expressions don't depend on the value of a generator loop variable(s), while correlated expressions do. Consider the following example:

    from some_module import f, g
    
    x = 100
    query = select(e for e in MyEntity if e.attr > f(x, 200) and g(x, e))
    

    In this query we have two sub-expressions: the first is f(x, 200) and the second is g(x, e). The former is considered by Pony as external expression because it doesn't use any loop variable. In that case Pony assumes that it is possible to calculate the value of the expression in Python before query execution, and then translate the expression into a single parameter. For such expressions Pony doesn't impose any restrictions on which Python functions can be used inside them, because a result of such an expression is just a single value evaluated in Python.

    The second expression g(x, e) cannot be evaluated in Python, because it depends on the value of the loop variable e. The result of such expression may be different for different table rows. Therefore, Pony needs to translate such expressions to SQL. Not every Python expression can be translated to SQL, and g needs to be a function which Pony specifically know how to translate. Pony defines a subset of Python operations which can be translated. This subset includes arithmetic operations on numeric types, string methods such as startswith, endswith, in, etc., and aggregated functions such as sum and max.

    In your code, when you write

    .order_by(lambda ssf: (ssf_type_order_map[ssf.type], ssf.duration, ssf.uuid))
    

    the expression ssf_type_order_map[ssf.type] refers to the object variable ssf, and hence will have different values for each table row, so this is correlated expression and Pony needs to translate that expression into SQL. Currently Pony doesn't understand how to perform such specific translation, but in principle this is doable. The result of translation will the following SQL CASE statement:

    ORDER BY CASE ssf.type
        WHEN 'Full' THEN 1
        WHEN 'Full (Instrumental)' THEN 2
        WHEN 'Shorts' THEN 3
        WHEN 'Loops' THEN 4
        WHEN 'Stems' THEN 5
        ELSE 0
    END
    

    The good news is that you can write such expression in Pony using Python if-expression syntax:

    (1 if ssf.type == 'Full' else
     2 if ssf.type == 'Full (Instrumental)' else
     3 if ssf.type == 'Shorts' else
     4 if ssf.type == 'Loops' else
     5 if ssf.type == 'Stems' else 0)
    

    At this moment Pony does not support decompiling if-expressions yet, so if you attempt to write such code directly, you will get an exception. As a workaround you need to pass the source of a lambda function as a string. In this case it will be translated just right, because we can directly parse the string to AST without decompiling. So you can write:

    song_source_files = self.song_source_files.select().order_by("""
        lambda ssf: ((1 if ssf.type == 'Full' else
                      2 if ssf.type == 'Full (Instrumental)' else
                      3 if ssf.type == 'Shorts' else
                      4 if ssf.type == 'Loops' else
                      5 if ssf.type == 'Stems' else 0),
                     ssf.duration, ssf.uuid)
    """)
    

    This should work perfectly, but I'd recommend to solve this problem in another way: we can have the SourceFileType entity with the name and code attributes and then order ssf records by ssf.type.code value:

    class SongSourceFile(db.Entity):
        name = Required(str)
        type = Required(lambda: SourceFileType)
        duration = Required(timedelta)
        uuid = Required(uuid.UUID, unique=True, default=uuid.uuid4)
    
    class SourceFileType(db.Entity):
        name = Required(str)
        code = Required(int)
        files = Set(lambda: SongSourceFile)
    

    Then it becomes possible writing the query in the following way:

    song_source_files = self.song_source_files.select().order_by(
        lambda ssf: (ssf.type.code, ssf.duration, ssf.uuid)
    )
    

    I think this approach is more universal, because now you can add other useful attributes to SourceFileType besides name and code and use them in queries too.