Search code examples
sqlpostgresqlpostgresql-8.1

SQL query to search an unique ID that can be in three different tables


I have three tables that control products, colors and sizes. Products can have or not colors and sizes. Colors can or not have sizes.

product      color                           size
-------      -------                         -------
id           id                              id
unique_id    id_product (FK from product)    id_product (FK from version)
stock        unique_id                       id_version (FK from version)
title        stock                           unique_id
                                             stock

The unique_id column, that is present in all tables, is a serial type (autoincrement) and its counter is shared with the three tables, basically it works as a global unique ID between them.

It works fine, but i am trying to increase the query performance when i have to select some fields based in the unique_id.

As i don't know where is the unique_id that i am looking for, i am using UNION, like below:

select title, stock
from product 
where unique_id = 10

UNION

select p.title, c.stock
from color c
join product p on c.id_product = p.id
where c.unique_id = 10

UNION

select p.title, s.stock
from size s
join product p on s.id_product = p.id
where s.unique_id = 10;

Is there a better way to do this? Thanks for any suggestion!

EDIT 1

Based on @ErwinBrandstetter and @ErikE answers i decided to use the below query. The main reasons is:

1) As unique_id has indexes in all tables, i will get a good performance

2) Using the unique_id i will find the product code, so i can get all columns i need using a another simple join

SELECT 

    p.title,
    ps.stock

FROM (

    select id as id_product, stock
    from product 
    where unique_id = 10

    UNION

    select id_product, stock
    from color
    where unique_id = 10

    UNION

    select id_product, stock
    from size
    where unique_id = 10

) AS ps

JOIN product p ON ps.id_product = p.id;

Solution

  • PL/pgSQL function

    To solve the problem at hand, a plpgsql function like the following should be faster:

    CREATE OR REPLACE FUNCTION func(int)
      RETURNS TABLE (title text, stock int) LANGUAGE plpgsql AS
    $BODY$
    BEGIN
    
    RETURN QUERY
    SELECT p.title, p.stock
    FROM   product p
    WHERE  p.unique_id = $1; -- Put the most likely table first.
    
    IF NOT FOUND THEN
        RETURN QUERY
        SELECT p.title, c.stock
        FROM   color c
        JOIN   product p ON c.id_product = p.id
        WHERE  c.unique_id = $1;
    END;
    
    IF NOT FOUND THEN
        RETURN QUERY
        SELECT p.title, s.stock
        FROM   size s
        JOIN   product p ON s.id_product = p.id
        WHERE  s.unique_id = $1;
    END IF;
    
    END;
    $BODY$;
    

    Updated function with table-qualified column names to avoid naming conflicts with OUT parameters.

    RETURNS TABLE requires PostgreSQL 8.4, RETURN QUERY requires version 8.2. You can substitute both for older versions.

    It goes without saying that you need to index the columns unique_id of every involved table. id should be indexed automatically, being the primary key.


    Redesign

    Ideally, you can tell which table from the ID alone. You could keep using one common sequence, but add 100000000 for the first table, 200000000 for the second and 300000000 for the third - or whatever suits your needs. This way, the least significant part of the number is easily distinguishable.

    A plain integer spans numbers from -2147483648 to +2147483647, move to bigint if that's not enough for you. I would stick to integer IDs, though, if possible. They are smaller and faster than bigint or text.


    CTEs (experimental!)

    If you cannot create a function for some reason, this pure SQL solution might do a similar trick:

    WITH x(uid) AS (SELECT 10) -- provide unique_id here
        , a AS (
        SELECT title, stock
        FROM   x, product 
        WHERE  unique_id = x.uid
        )
        , b AS (
        SELECT p.title, c.stock
        FROM   x, color c
        JOIN   product p ON c.id_product = p.id
        WHERE  NOT EXISTS (SELECT 1 FROM a)
        AND    c.unique_id = x.uid
        )
        , c AS (
        SELECT p.title, s.stock
        FROM   x, size s
        JOIN   product p ON s.id_product = p.id
        WHERE  NOT EXISTS (SELECT 1 FROM b)
        AND    s.unique_id = x.uid
        )
    SELECT * FROM a
    UNION ALL
    SELECT * FROM b
    UNION ALL
    SELECT * FROM c;
    

    I am not sure whether it avoids additional scans like I hope. Would have to be tested. This query requires at least PostgreSQL 8.4.


    Upgrade!

    As I just learned, the OP runs on PostgreSQL 8.1.
    Upgrading alone would speed up the operation a lot.


    Query for PostgreSQL 8.1

    As you are limited in your options, and a plpgsql function is not possible, this function should perform better than the one you have. Test with EXPLAIN ANALYZE - available in v8.1.

    SELECT title, stock
    FROM   product 
    WHERE  unique_id = 10
    
    UNION ALL
    SELECT p.title, ps.stock
    FROM   product p
    JOIN  (
        SELECT id_product, stock
        FROM   color
        WHERE  unique_id = 10
    
        UNION ALL
        SELECT id_product, stock
        FROM   size
        WHERE  unique_id = 10
        ) ps ON ps.id_product = p.id;