Search code examples
jooq

JOOQ multiple joins on same table performance


Using jooq 3.19 with implicit join paths multiple joins are generated on same table.

  1. Is this expected?

  2. Does multiple joins on same table introduce performance issues?

Example:

DSL.using( SQLDialect.SQLSERVER2017, new Settings().withRenderImplicitJoinToManyType( RenderImplicitJoinType.LEFT_JOIN ) )
.select( Entity.ENTITY.ID ).from( Entity.ENTITY )
.where( Entity.ENTITY.subEntity()
.entity().subEntity()
.entity().subEntity()
.entity().subEntity()
.entity().NAME.eq( "a" ) )
.getSQL()

it generates the following sql:

"select [ENTITY].[ID] from ([ENTITY] 
left outer join ([SUB_ENTITY] [alias_87004649] 
left outer join ([SUB_ENTITY] [alias_112570288] 
left outer join ([SUB_ENTITY] [alias_55342369] 
left outer join ([SUB_ENTITY] [alias_126614199] 
join [ENTITY] [alias_74449122] 
on [alias_126614199].[ENTITY_ID] = [alias_74449122].[ID]) 
on [alias_126614199].[ENTITY_ID] = [alias_55342369].[ID]) 
on [alias_55342369].[ENTITY_ID] = [alias_112570288].[ID]) 
on [alias_112570288].[ENTITY_ID] = [alias_87004649].[ID]) 
on [alias_87004649].[ENTITY_ID] = [ENTITY].[ID]) 
where [alias_74449122].[NAME] = ?"

Solution

  • Is this expected?

    Yes it is.

    While at first, it looks like navigating the same foreign key path again back and forth is meaningless, and joins could be eliminated, they really cannot. Every to-many path segment can create a cartesian product with the previous segments, just like any ordinary, explicit to-many join can do that. Hence, due to the slight chance of such a cartesian product being desired, jOOQ can't prevent it without altering query logic, so if you tell jOOQ to re-join the same table many times (even if you don't intend to), then jOOQ must re-join the same table many times in order to implement your potentially desired query logic.

    Do note that the jumps back to the parent are in fact eliminated, because eliminating them doesn't alter query semantics in your case, as no parent column is being projected:

    Does multiple joins on same table introduce performance issues?

    Of course! Not only does this produce performance issues, you're also getting wrong results. You seem to have turned on Settings.renderImplicitJoinToManyType just to "get it to work," without thinking about the implications. But the implications are precisely what I said. Your WHERE clause now produces a cartesian product, which I doubt you want nor expect.

    The manual's section about implicit to-many path joins is clear about this

    Accidental duplicate objects isn't the main problem that such implicit to-many path joins would cause. The main problem is that an implicit to-many path placed in the SELECT clause or WHERE clause (and other clauses) would be able to generate rows, when in fact SELECT only transforms rows (like Stream.map()) and WHERE only filters rows (like Stream.filter()). It would be very SQL-unidiomatic and confusing for these clauses to be able to effectively produce rows.

    In your particular case, you probably want to move the predicate into an EXISTS subquery using join path correlation, e.g.

    ctx
      .select(ENTITY.ID)
      .from(ENTITY)
      .where(exists(
        selectOne()
        .from(ENTITY.subEntity()
          ...
          .entity().NAME.eq("a")
        )
      ))
    .getSQL()
    

    This doesn't require enabling the above controversial Settings.renderImplicitJoinToManyType. You'll still have your weird path and the unnecessary joins, but at least, you won't get any cartesian products now.

    Getting rid of cycles

    You know your own RQL to jOOQ translation semantics best. If you know you can remove these cycles, then best do this yourself. I hope the above answer clarified why jOOQ cannot do this for you, automatically.