OpenJPA distinct on

In my DB schema, each conversation has several emails. I want to get the newest email from each conversation in a list of conversations. In PostgreSQL the query is:

select distinct on (conversation_id) * 
from email 
where conversation_id in (7085214, 7084964) 
order by conversation_id, processing_date desc

OpenJPA:

(List<Email>) entityManager.createQuery(
      "SELECT distinct(email.conversation.id), email FROM Email email "
    + "WHERE email.conversation.id in :id "
    + "ORDER BY email.conversation.id, email.processingDate DESC")
   .setParameter("id", conversationIds)
   .getResultList();

This gives back a map of the conversation ids and the whole list of emails in those conversations, rather than only the newest email of each conversation.

How could I make it right?

Thanks

Solution

Use native SQL.

The only other way to do what you want is to develop a patch to OpenJPA that "teaches" it how to use the PostgreSQL extension DISTINCT ON in its JPQL parser and query generator. Most ORMs accept such extensions via dialect hooks. Don't expect this to be a simple task, though - unless you're writing a lot of these queries, native SQL is almost certain to be much easier.
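As a rough sketch of the native SQL route, the helper below builds the query from the question with one positional placeholder per conversation id, since binding a whole list to a single native-query parameter is not something to rely on across JPA providers. The class name LatestEmailSql and its build method are made up for illustration and are not part of OpenJPA. The commented usage assumes that Email is a mapped entity and that conversationIds is a List, as in the question.

```java
import java.util.StringJoiner;

// Hypothetical helper (not part of OpenJPA): builds the native PostgreSQL
// query from the question with one positional placeholder per conversation id.
final class LatestEmailSql {

    private LatestEmailSql() {}

    static String build(int idCount) {
        if (idCount < 1) {
            throw new IllegalArgumentException("at least one conversation id is required");
        }
        // Produces "?1, ?2, ..." so that each id is bound as its own parameter.
        StringJoiner placeholders = new StringJoiner(", ");
        for (int i = 1; i <= idCount; i++) {
            placeholders.add("?" + i);
        }
        return "SELECT DISTINCT ON (conversation_id) * "
             + "FROM email "
             + "WHERE conversation_id IN (" + placeholders + ") "
             + "ORDER BY conversation_id, processing_date DESC";
    }
}

// Assumed usage with a JPA EntityManager (needs the JPA API on the classpath):
//
//   Query query = entityManager.createNativeQuery(
//           LatestEmailSql.build(conversationIds.size()), Email.class);
//   for (int i = 0; i < conversationIds.size(); i++) {
//       query.setParameter(i + 1, conversationIds.get(i));
//   }
//   List<Email> newest = query.getResultList();
```

Passing Email.class to createNativeQuery asks the provider to map each returned row to an Email entity, so the rest of the code can keep working with entities even though the query itself is PostgreSQL-specific.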

You can't just use DISTINCT or DISTINCT ON like functions. They aren't; they're completely separate syntax. A JPQL engine would try to convert it into a true function call that'd fail at runtime - or in the case of distinct on, just fail to parse it in the first place.

BTW, DISTINCT ON is a bit like GROUP BY in some other vendors' databases like MySQL (with ONLY_FULL_GROUP_BY disabled), where you're allowed to specify columns in the SELECT that appear neither in the GROUP BY nor in an aggregate. So in MySQL people probably do this by just producing a technically invalid query that MySQL accepts anyway. It's quite likely that the OpenJPA JPQL handler won't notice the problem, so it'll pass the query through fine. This trick won't work for DISTINCT ON, and PostgreSQL is strictly standards-compliant about GROUP BY: it won't let you produce a non-deterministic query with GROUP BY.