Search code examples
jooq

JOOQ: multisetAgg or toSet filtering out NULL


Quite often, the new feature of multisetAgg is used along with LEFT JOINs.

Let's say, I have a user as dimension table and fact table paid_subscriptions. I want to query a specific user with all of his paid subscriptions and for each subscription do some processing (like sending an email or whatever).

I would write some JOOQ like this:

ctx
                .select(row(
                        USER.ID,
                        USER.USERNAME,
                        multisetAgg(PAIDSUBSCRIPTIONS.SUBNAME).as("subscr").convertFrom(r -> r.intoSet(Record1::value1))
                        ).mapping(MyUserWithSubscriptionPOJO::new)
                )
                .from(USER)
                .leftJoin(PAIDSUBSCRIPTIONS).onKey()
                .where(someCondition)
                .groupBy(USER)
                .fetch(Record1::value1));

The problem here is: the multisetAgg produces a Set which can contain null as element.

I either heve to filter out the null subscriptions I don't care about after JOOQ select, or I have to rewrite my query with something like this:

multisetAgg(PAIDSUBSCRIPTIONS.SUBNAME).as("subscr").convertFrom(r -> {
                                    final Set<String> res = r.intoSet(Record1::value1);
                                    res.remove(null); // remove possible nulls
                                    return res;
                                })

Both don't look too nice in code.

I wonder if there is a better approach to write this with less code or even an automatic filtering of null values or some other kind of syntactic sugar avilable in JOOQ? After all, I think it is quite a common usecase especially considering that often enough, I end up with some java8 style stream processing of my left joined collection and first step is to filter out null which is something I forget often :)


Solution

  • You're asking for a few things here:

    • SET instead of MULTISET (will be addressed with #12033)
    • Adding NULL filtering (is already possible with FILTER)
    • The implied idea that such NULL values could be removed automatically (might be addressed with #13776)

    SET instead of MULTISET

    The SQL standard has some notions of a SET as opposed to MULTISET or ARRAY. For example:

    It isn't as powerful as MULTISET, and it doesn't have to be, because usually, just by adding DISTINCT you can turn any MULTISET into a SET. Nevertheless, Informix (possibly the most powerful ORDBMS) does have SET data types and constructors:

    So, we might add support for this in the future, perhaps. I'm not sure yet of its utility, as opposed to using DISTINCT with MULTISET (already possible) or MULTISET_AGG (possible soon):

    Adding NULL filtering

    You already have the FILTER clause to do this directly in SQL. It's a SQL standard and supported by jOOQ natively, or via CASE emulations. A native SQL example, as supported by e.g. PostgreSQL:

    select 
      t.a, 
      json_agg(u.c),
      json_agg(u.c) filter (where u.b is not null)
    from (values (1), (2)) t (a)
    left join (values (2, 'a'),(2, 'b'),(3, 'c'),(3, 'd')) u (b, c) on t.a = u.b
    group by t.a
    

    Producing:

    |a  |json_agg  |json_agg  |
    |---|----------|----------|
    |1  |[null]    |          |
    |2  |["a", "b"]|["a", "b"]|
    

    So, just write:

    multisetAgg(PAIDSUBSCRIPTIONS.SUBNAME).filter(PAIDSUBSCRIPTIONS.ID.isNotNull())
    

    The implied idea that such NULL values could be removed automatically

    Note, I understand that you'd probably like this to be done automatically. There's a thorough discussion on that subject here: #13776. As always, it's a desirable thing that is far from easy to implement consistently.

    I'm positive that this will be done eventually, but it's a very big change.