Search code examples
sqldatabasepostgresqlreport

How to create a Postgres report with columns whose conditions are mutually exclusive and share no keys?


Background

I have a Postgres 11 database running on RDS. I have two tables, orders and items, like so :

CREATE TABLE schema.orders (
  order_number TEXT,
  order_date TIMESTAMPTZ,
  sales_channel_name TEXT
);

CREATE TABLE schema.items (
 order_number TEXT REFERENCES schema.orders(order_number),
 key TEXT
 quantity INT
);

I need to create a report which shows, in two separate columns, the count of line items with some string in column key, and without some string in column key, grouped by the day of the associated order_date.

An example of the desired output is below:

              day         |  double_items_count               |  normal_items_count
-----------------------------------------------------------------------------------------------
   2020-04-09 00:00:00    |        22                         |     13

Each desired column can be defined as one of the below queries:

SELECT 
     date_trunc('day', o.order_date AT TIME ZONE 'America/Los_Angeles') AS day,
    2*(count(*)) AS double_items_count
    FROM 
    schema.items i  
    INNER JOIN
    schema.orders o 
    ON i.order_number = o.order_number
    WHERE 
    i.key ILIKE '%some_string%'
    AND o.sales_channel_name = 'foo_sales_channel'
    GROUP BY day 
    ORDER BY day DESC
;

SELECT 
    date_trunc('day', o.order_date AT TIME ZONE 'America/Los_Angeles') AS day,
    count(*) AS normal_items_count
    FROM 
    schema.items  
    INNER JOIN
    schema.orders_new o 
    ON i.order_number = o.order_number
    WHERE 
    i.key NOT ILIKE '%some_string%'
    AND o.sales_channel_name = 'foo_sales_channel'
    GROUP BY day 
    ORDER BY day DESC
;

I also have a view schema.items_from_channel that returns only those records I’m interested in :

CREATE VIEW schema.items_from_channel AS (
    SELECT 
    date_trunc('day', o.date at time zone 'America/Los_Angeles') as day,
    o.order_number,
    i.key,
    o.sales_channel
    FROM 
    schema.orders o 
    INNER JOIN
    schema.items i 
    ON 
    o.order_number = i.order_number 
    WHERE o.sales_channel = "foo_sales_channel"
    ORDER BY day DESC
);

The problem

Ordinarily, I’d handle this sort of thing with either a straight query against the view, or with a join of the two aforedescribed queries.

However, since the desired columns depend on mutually exclusive WHERE conditions, I don’t know how to construct a single query which will produce both (e.g., with subqueries).

Since the two queries share no key except day, I can’t figure out how to join them in a way that produces sensible results [joining on day, which would make sense to me, produces inflated numbers].

UNION doesn’t produce the desired result, as it returns all the data desired but does not maintain the required column formatting.

I’m stumped. How can I produce the desired report? I’ve been googling and combing through the relevant SE sites for the better part of the day, but haven’t hit on the solution. All guidance much appreciated!


Solution

  • With following data:

    select * from orders;
     order_number |     order_date      | sales_channel_name 
    --------------+---------------------+--------------------
                1 | 2020-04-09 01:00:00 | foo_sales_channel
                2 | 2020-04-09 02:00:00 | foo_sales_channel
                3 | 2020-04-09 03:00:00 | foo_sales_channel
                4 | 2020-04-09 04:00:00 | foo_sales_channel
    (4 rows)
    
    select * from items;
     id |      key       | order_number 
    ----+----------------+--------------
      1 | some_string    |            1
      2 | some_string    |            2
      3 | another_string |            3
      4 | another_string |            4
    (4 rows)
    
    SELECT 
         date_trunc('day', o.order_date AT TIME ZONE 'America/Los_Angeles') AS day,
        2*(count(*)) AS double_items_count
        FROM 
        items i  
        INNER JOIN
        orders o 
        ON i.order_number = o.order_number
        WHERE 
        i.key ILIKE '%some_string%'
        AND o.sales_channel_name = 'foo_sales_channel'
        GROUP BY day 
        ORDER BY day DESC;
              day           | double_items_count 
    ------------------------+--------------------
     2020-04-09 00:00:00+02 |                  4
    (1 row)
    
    SELECT 
        date_trunc('day', o.order_date AT TIME ZONE 'America/Los_Angeles') AS day,
        count(*) AS normal_items_count
        FROM 
        items  i
        INNER JOIN
        orders o 
        ON i.order_number = o.order_number
        WHERE 
        i.key NOT ILIKE '%some_string%'
        AND o.sales_channel_name = 'foo_sales_channel'
        GROUP BY day 
        ORDER BY day DESC
    ;
              day           | normal_items_count 
    ------------------------+--------------------
     2020-04-09 00:00:00+02 |                  2
    (1 row)
    

    Here is a solution:

    SELECT
        date_trunc('day', o.order_date AT TIME ZONE 'America/Los_Angeles') AS day,
        2*count(*) FILTER (WHERE i.key ILIKE '%some_string%') AS double_items_count,
        count(*)   FILTER (WHERE i.key NOT ILIKE '%some_string%') AS normal_items_count
        FROM
        items  i
        INNER JOIN
        orders o
        ON i.order_number = o.order_number
        WHERE
        o.sales_channel_name = 'foo_sales_channel'
        GROUP BY day
        ORDER BY day DESC
    ;
              day           | double_items_count | normal_items_count 
    ------------------------+--------------------+--------------------
     2020-04-09 00:00:00+02 |                  4 |                  2
    (1 row)