Search code examples
postgresqljoinaggregate-functionsuuid

How to select minimum UUID with left outer join?


I'm trying to select a row from a table which:

  1. has a minimum UUID
  2. is not referenced in another table

But I'm having problems when I try to enforce the first constraint.

Here's everything working as expected on integers: First, create tables that look like this:

t1
+----+---------+
| id | content |
+----+---------+
|  1 | a       |
|  2 | b       |
|  3 | c       |
+----+---------+

and

t2
+----+---------+
| id | t1_id   |
+----+---------+
|  1 | 1       |
+----+---------+
postgres=# create table t1(id int, content varchar(10), primary key (id));
CREATE TABLE
postgres=# create table t2(id int, t1_id int, foreign key (t1_id) references t1(id));
CREATE TABLE
postgres=# insert into t1 values (1, 'a');
INSERT 0 1
postgres=# insert into t1 values (2, 'b');
INSERT 0 1
postgres=# insert into t1 values (3, 'c');
INSERT 0 1
postgres=# insert into t2 values (1, 1);
INSERT 0 1

Now, I want to select the row in t1 with the lowest id which doesn't appear as a foreign key in t2. I want to select the row in t1 which has id = 2 and it works as expected:

postgres=# select min(t1.id) from t1 left outer join t2 on t1.id = t2.t1_id where t2.id is null;
 min
-----
   2
(1 row)

However, when I try the same with UUIDs, the final query fails to return anything at all. Note, I've used the answer from this post to define a way to find minimum UUIDs:

CREATE OR REPLACE FUNCTION min(uuid, uuid)
RETURNS uuid AS $$
BEGIN
    IF $2 IS NULL OR $1 > $2 THEN
        RETURN $2;
    END IF;

    RETURN $1;
END;
$$ LANGUAGE plpgsql;


create aggregate min(uuid) (
  sfunc = min,
  stype = uuid,
  combinefunc = min,
  parallel = safe,
  sortop = operator (<)
);

Now, build the tables just the same as before, use gen_random_uuid to autogenerate UUIDs:

postgres=# drop table t2;
postgres=# drop table t1;
postgres=# create table t1(id uuid default gen_random_uuid(), content varchar(10), primary key (id));
postgres=# create table t2(id int, t1_id uuid, foreign key (t1_id) references t1(id));
postgres=# insert into t1(content) ('a');
postgres=# insert into t1(content) values ('a');
postgres=# insert into t1(content) values ('b');
postgres=# insert into t1(content) values ('c');

We've successfully made three entries in t1. Add an entry to t2:

postgres=# select * from t1;
                  id                  | content
--------------------------------------+---------
 b6148ae3-db56-4a4a-8d46-d5b4f04277ac | a
 03abd324-8626-4fb1-9cb0-593373abf9ca | b
 9f12b297-3f60-48a7-8282-e27c3aff1152 | c
(3 rows)


postgres=# insert into t2 values(1, '9f12b297-3f60-48a7-8282-e27c3aff1152');

Try to select the row from t1 with a minimum ID that doesn't appear in t2, note that this fails.

postgres=# select min(t1.id) from t1 left outer join t2 on t1.id = t2.t1_id where t2.id is null;
 min
-----

(1 row)

Here we show that we can select the two unreferenced entries in t1 and we can select a minimum UUID independently:

postgres=# select t1.id from t1 left outer join t2 on t1.id = t2.t1_id where t2.id is null;
                  id
--------------------------------------
 03abd324-8626-4fb1-9cb0-593373abf9ca
 b6148ae3-db56-4a4a-8d46-d5b4f04277ac
(2 rows)

postgres=# select min(id) from t1;
                 min
--------------------------------------
 03abd324-8626-4fb1-9cb0-593373abf9ca
(1 row)

So, something funny goes on when I try to select a minimum UUID while also trying to perform the left outer join.

EDIT: the same problem exists when using not exists:

postgres=# select min(id) from t1 where not exists (select t1_id from t2 where t2.t1_id = t1.id);
 min
-----

(1 row)

but the problem doesn't appear when using not in:

postgres=# select min(id) from t1 where id not in (select t1_id from t2);
                 min
--------------------------------------
 03abd324-8626-4fb1-9cb0-593373abf9ca
(1 row)

Solution

  • Found a solution, turns out the function comparing UUIDs from this post isn't correct. Here's the function I wrote, which is probably less performant, which passes all the cases it failed at before:

        CREATE FUNCTION min_uuid(uuid, uuid)
        RETURNS uuid AS $$
        BEGIN
            -- if they're both null, return null
            IF $2 IS NULL AND $1 IS NULL THEN
                RETURN NULL ;
            END IF;
    
            -- if just 1 is null, return the other
            IF $2 IS NULL THEN
                RETURN $1;
            END IF ;
            IF $1 IS NULL THEN
                RETURN $2;
              END IF;
    
            -- neither are null, return the smaller one
            IF $1 > $2 THEN
                RETURN $2;
            END IF;
    
            RETURN $1;
        END;
        $$ LANGUAGE plpgsql;
    
    
        create aggregate min(uuid) (
          sfunc = min_uuid,
          stype = uuid,
          combinefunc = min_uuid,
          parallel = safe,
          sortop = operator (<)
        );