Search code examples
javapostgresqljooq

Create a custom Aggregate Function with jOOQ


Context

I am working with jOOQ against a PostgreSQL database.
I want to use jsonb_object_agg(name, value) onto the resultset of a LEFT OUTER JOIN.

Problem

The join being an OUTER one, sometimes the name component of the aggregation function is simply null: that can't work. I would then go for:

COALESCE(
    json_object_agg(table.name, table.value) FILTER (WHERE table.name IS NOT NULL),
    '{}'
)::json

As of now, the code I use to call jsonb_object_agg is (not exactly, but boils down to) the following:

public static Field<?> jsonbObjectAgg(final Field<?> key, final Select<?> select) {
    return DSL.field("jsonb_object_agg({0}, ({1}))::jsonb", JSON_TYPE, key, select);
}

... where JSON_TYPE is:

private static final DataType<JsonNode> JSON_TYPE = SQLDataType.VARCHAR.asConvertedDataType(/* a custom Converter */);

Incomplete solution

I would love to leverage jOOQ's AggregateFilterStep interface, and in particular, to be able to use its AggregateFilterStep#filterWhere(Condition... conditions).

However, the org.jooq.impl.Function class that implements AggregateFilterStep (indirectly via AgregateFunction and ArrayAggOrderByStep) is restricted in visibility to its package, so I can't just recycle blindly the implementation of DSL#ArrayAggOrderByStep:

public static <T> ArrayAggOrderByStep<T[]> arrayAgg(Field<T> field) {
    return new org.jooq.impl.Function<T[]>(Term.ARRAY_AGG, field.getDataType().getArrayDataType(), nullSafe(field));
}

Attempts

The closest I got to something reasonable is... building my own coalesceAggregation function that specifically coalesces aggregated fields:

//                                  Can't quite use AggregateFunction there
//                                                   v   v
public static <T> Field<T> coalesceAggregation(final Field<T> agg, final Condition coalesceWhen, @NonNull final T coalesceTo) {
    return DSL.coalesce(DSL.field("{0} FILTER (WHERE {1})", agg.getType(), agg, coalesceWhen), coalesceTo);
}

public static <T> Field<T> coalesceAggregation(final Field<T> agg, @NonNull final T coalesceTo) {
    return coalesceAggregation(agg, agg.isNotNull(), coalesceTo);
}

... But I then ran into issues with my T type being JsonNode, where DSL#coalesce seems to CAST my coalesceTo to varchar.

Or, you know:

DSL.field("COALESCE(jsonb_object_agg({0}, ({1})) FILTER (WHERE {0} IS NOT NULL), '{}')::jsonb", JSON_TYPE, key, select)

But that'd be the very last resort: it'd feel like I'd merely be one step away from letting the user inject any SQL they want into my database 🙄

In short

Is there a way in jOOQ to "properly" implement one's own aggregate function, as an actual org.jooq.AgregateFunction?
I'd like to avoid having it generated by jooq-codegen as much as possible (not that I don't like it – it's just our pipeline that's horrible).


Solution

  • Starting with jOOQ 3.14.0

    The JSON_OBJECTAGG aggregate function is supported natively in jOOQ now:

    DSL.jsonObjectAgg(TABLE.NAME, TABLE.VALUE).filterWhere(TABLE.NAME.isNotNull());
    

    Support for the FILTER clause was added in jOOQ 3.14.8.

    Starting with jOOQ 3.14.8 and 3.15.0

    If jOOQ doesn't implement a specific aggregate function, you can now specify DSL.aggregate() to make use of custom aggregate functions.

    DSL.aggregate("json_object_agg", SQLDataType.JSON, TABLE.NAME, TABLE.VALUE)
       .filterWhere(TABLE.NAME.isNotNull());
    

    This was implemented with https://github.com/jOOQ/jOOQ/issues/1729

    Pre jOOQ 3.14.0

    There's a missing feature in the jOOQ DSL API, namely to create plain SQL aggregate functions. The reason why this is not available yet (as of jOOQ 3.11) is because there are a lot of delicate internals to specifying a vendor agnostic aggregate function that supports all of the vendor-specific options including:

    • FILTER (WHERE ...) clause (as you mentioned in the question), which has to be emulated using CASE
    • OVER (...) clause to turn an aggregate function into a window function
    • WITHIN GROUP (ORDER BY ...) clause to support ordered set aggregate functions
    • DISTINCT clause, where supported
    • Other, vendor-specific extensions to aggregate functions

    The easy workaround in your specific case is to use plain SQL templating all the way as you mentioned in your question as well:

    DSL.field("COALESCE(jsonb_object_agg({0}, ({1})) FILTER (WHERE {0} IS NOT NULL), '{}')::jsonb", JSON_TYPE, key, select)
    

    Or you do the thing you've mentioned previously. Regarding that concern:

    ... But I then ran into issues with my T type being JsonNode, where DSL#coalesce seems to CAST my coalesceTo to varchar.

    That's probably because you used agg.getType() which returns Class<?> instead of agg.getDataType() which returns DataType<?>.

    But that'd be the very last resort: it'd feel like I'd merely be one step away from letting the user inject any SQL they want into my database

    I'm not sure why that is an issue here. You will still be able to control your plain SQL API usage yourself, and users won't be able to inject arbitrary things into key and select because you control those elements as well.