Search code examples
ponyorm

Pony ORM JOIN syntax


I've started using Pony and haven't actually understood how to use joins yet. In examples I saw when left_join() was used with two for clauses, but when I try to repeat it in my code, I get error like "collection was expected, got "for p in Pond"" Maybe somebody could explain how to use it or point me to docs page where it's already explained?


Solution

  • Let's say we have the following entities:

    from pony import orm
    
    db = orm.Database()
    
    class Person(db.Entity):
        id = orm.PrimaryKey(int, auto=True)
        name = orm.Required(str)
        age = orm.Required(int)
        contacts = orm.Set('Contact')
    
    class Contact(db.Entity):
        id = orm.PrimaryKey(int, auto=True)
        person = orm.Required('Person')
        type = orm.Required(str)
        value = orm.Required(str)
    
    db.generate_mapping(create_tables=True)
    
    with orm.db_session:
        john = Person(name='John', age=23)
        mary = Person(name='Mary', age=21)
        mike = Person(name='Mike', age=18)
        arthur = Person(name='Arthur', age=25)
    
        john.contacts.create(type='mobile', value='1234567')
        john.contacts.create(type='email', value='[email protected]')
    
        mary.contacts.create(type='mobile', value='76543321')
        mary.contacts.create(type='skype', value='mary123')
    
        mike.contacts.create(type='mobile', value='2345678')
    

    Now we want to print person name and a contact info for each person older then 20. There are several ways how we can do it.


    The first way is when we explicitly state the join condition. This way is pretty verbose:

    query = orm.select(
        (p.name, c.value)
        for p in Person for c in Contact
        if p.age > 20 and c.person == p
    )
    query.show()
    

    In this query we explicitly state the join condition: c.person == p. The query will show us the following result:

    p.name|c.type|c.value
    ------+------+----------------
    John  |email |[email protected]
    John  |mobile|1234567
    Mary  |mobile|76543321
    Mary  |skype |mary123
    

    As you can see, Arthur was not included into result, albeit his age is greater than 20. This is because this type of join is inner join, and the result includes only persons for which it was possible to find at least one contact.


    The second way of joining is when we loop over collection attribute:

    query = orm.select(
        (p.name, c.value)
        for p in Person for c in p.contacts
        if p.age > 20
    )
    query.show()
    

    This type of joins is used most often. It is very convenient, because we don't need to explicitly specify the join condition. The result of query is the same as before:

    p.name|c.type|c.value
    ------+------+----------------
    John  |email |[email protected]
    John  |mobile|1234567
    Mary  |mobile|76543321
    Mary  |skype |mary123
    

    Arthur is still not in the list for the same reason as before. If we want to include Arthur into result, we need to use other type of join, namely, left join:

    query = orm.left_join(
        (p.name, c.value)
        for p in Person for c in p.contacts
        if p.age > 20
    )
    query.show()
    

    In that case the result of query includes Arthur with the None value instead of the phone number:

    p.name|c.type|c.value
    ------+------+----------------
    Arthur|None  |None
    John  |email |[email protected]
    John  |mobile|1234567
    Mary  |mobile|76543321
    Mary  |skype |mary123
    

    When you use left_join you need to loop over collection. In that case Pony adds the join condition into the ON section of LEFT JOIN clause of SQL command.

    You cannot do explicit join as in the very first query if you use left_join, because in that case Pony does not know which condition is to put into ON section of LEFT JOIN clause.

    Sometimes it may be useful to specify content of the ON section manually. Right now Pony does not support such queries, but this feature may be added in the future.


    When using PonyORM in many cases it is possible to retrieve data without making joins at all. For example, you can write the following loop to print person name and phone number:

    with db_session:
        for p in Person.select(lambda p: p.age > 20):
            print(p.name)
            for c in p.contacts:
                print(c.type, c.value)
    

    In other ORMs this will lead to "N+1 query" problem, where the contacts of each person are retrieved by separate SQL query. Pony tries to automatically optimize queries to avoid "N+1 query" pattern.


    In some cases the joins are implicit. For example, to find all contacts of a person whose name is started with 'M', you can write:

    query = select(c for c in Contact if c.person.name.startswith('M'))
    for c in query:
        print(c.person.name, c.type, c.value)
    

    In that case the Person table is joined implicitly, just because you perform attribute traversing from Contact to Person.