Apologies, this is similar to a question I recently asked that has since been answered.
I have a query below that tracks customer purchases by the hour in order to determine when a customer is likely to place an order. The query appears to work, but I'm stuck on a few issues and am looking for help. First, instead of NULLs I want to display 0 (zero) for the hours in which no purchase was made. I think COALESCE() is the right way to proceed, but I ran into syntax errors. Second, I want to display the customer's FIRST_NAME and LAST_NAME after the customer_id, and the total purchases after hour 23, for each customer_id. I am thinking of a LEFT JOIN, as I want to display customers that have no purchases too; in my test case below, that would be customer_id 2. My test case and sample data are below. As always, if there is a better or simpler way to code this, I would appreciate any input.
ALTER SESSION SET NLS_TIMESTAMP_FORMAT = 'DD-MON-YYYY HH24:MI:SS.FF';
CREATE TABLE customers (CUSTOMER_ID, FIRST_NAME, LAST_NAME) AS
SELECT 1, 'Faith', 'Aaron' FROM DUAL UNION ALL
SELECT 2, 'Lisa', 'Jones' FROM DUAL UNION ALL
SELECT 3, 'Roz', 'Doyle' FROM DUAL;
create table purchases(
ORDER_ID NUMBER GENERATED BY DEFAULT AS IDENTITY (START WITH 1) NOT NULL,
customer_id number,
PRODUCT_ID NUMBER,
QUANTITY NUMBER,
purchase_date timestamp
);
insert into purchases (customer_id, product_id, quantity, purchase_date)
select 1 customer_id, 102 product_id, 1 quantity,
TIMESTAMP '2024-04-03 00:00:00' + INTERVAL '23:27' HOUR TO MINUTE + ((LEVEL-1) * INTERVAL '1 00:00:01' DAY TO SECOND) * -1 + ((LEVEL-1) * interval '0.007125' second)
as purchase_date
from dual
connect by level <= 3 UNION all
select 1, 101, 1,
TIMESTAMP '2024-05-10 00:00:57' + INTERVAL '07:17' HOUR TO MINUTE + ((LEVEL-1) * INTERVAL '1 00:00:01' DAY TO SECOND) * -1 + ((LEVEL-1) * interval '0.000120' second)
from dual
connect by level <= 2 UNION all
select 1, 101, 1,
TIMESTAMP '2024-06-13 00:00:59.999999' + INTERVAL '23:14' HOUR TO MINUTE + ((LEVEL-1) * INTERVAL '1 00:00:00' DAY TO SECOND) * -1 + ((LEVEL-1) * interval '0.999999' second)
from dual
connect by level <= 1 UNION all
select 3, 100, 1,
TIMESTAMP '2024-06-16 00:00:00.888999' + INTERVAL '00:37' HOUR TO MINUTE + ((LEVEL-1) * INTERVAL '1 00:00:00' DAY TO SECOND) * -1 + ((LEVEL-1) * interval '0.999999' second)
from dual
connect by level <= 1 UNION all
select 3, 103, 3,
TIMESTAMP '2024-06-09 00:00:00' + INTERVAL '17:37' HOUR TO MINUTE + ((LEVEL-1) * INTERVAL '1 00:00:00' DAY TO SECOND) * -1 + ((LEVEL-1) * interval '0.009120' second)
from dual
connect by level <= 6;
SELECT *
FROM
(
SELECT
customer_id,
SUBSTR(TO_CHAR(PURCHASE_DATE ,'HH24'),1,2) tm
FROM purchases)
PIVOT
(
SUM (1)
FOR TM IN ('00','01','02','03','04','05','06','07','08','09','10','11','12','13','14','15','16','17','18','19','20','21','22','23'
)
) pv
ORDER BY customer_id;
First, instead of output of NULLS I want to display 0 (zero) for those hours a purchase wasn't made. I think COALESCE() is the right way to proceed but ran into syntax errors.
Use COUNT instead of SUM: COUNT returns 0 when there are no matching rows, whereas SUM returns NULL. However, if you OUTER JOIN the customers table to the pivoted result, customers with no purchases at all will still have NULL in every pivoted column, so you will have to use COALESCE to show those as 0 (and, in that case, you could leave the aggregate as SUM).
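As a quick check of the difference, here is a minimal sketch that aggregates over an empty row set (the `WHERE 1 = 0` filter and the column aliases are only there for illustration):

```sql
-- No rows survive the filter, so both aggregates run over an empty set.
-- COUNT returns 0, while SUM returns NULL.
SELECT COUNT(1) AS cnt,
       SUM(1)   AS total
FROM   DUAL
WHERE  1 = 0;
```

This returns a single row with CNT = 0 and TOTAL = NULL.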
You are probably getting syntax errors because you have not given the pivoted columns aliases, so you are getting column names like '00' and, to reference them, you would need to use quoted identifiers such as "'00'". It is easier and nicer to alias the columns so that you do not have to use quoted identifiers.
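For illustration, this is roughly what referencing the unaliased columns of your original query would look like. It is a sketch limited to two hours, and the output aliases h00 and h23 are just example names:

```sql
-- Without aliases in the IN list, the pivoted columns are literally
-- named '00' and '23' (quotes included), so they must be referenced
-- as quoted identifiers.
SELECT customer_id,
       COALESCE("'00'", 0) AS h00,
       COALESCE("'23'", 0) AS h23
FROM   (
  SELECT customer_id,
         TO_CHAR(purchase_date, 'HH24') AS tm
  FROM   purchases
)
PIVOT (
  SUM(1)
  FOR tm IN ('00', '23')
)
ORDER BY customer_id;
```

Writing the IN list as `'00' AS h00, '23' AS h23` instead lets the outer query use plain `h00` and `h23`.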
Secondly, I want to display the customers FIRST_NAME, LAST_NAME after the customer_id
OUTER JOIN the customers table (LEFT or RIGHT depending on whether the customers table appears before or after the PIVOT, respectively).
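As a sketch of the LEFT variant, the customers table can be listed first and joined to the pivoted result wrapped in a sub-query. Only two hours are shown to keep it short, and the alias pv and the column names h7 and h23 are illustrative choices:

```sql
-- customers comes first, so LEFT OUTER JOIN keeps customers
-- that have no rows in the pivoted purchases.
SELECT c.customer_id,
       c.first_name,
       c.last_name,
       COALESCE(pv.h7, 0)  AS h7,
       COALESCE(pv.h23, 0) AS h23
FROM   customers c
       LEFT OUTER JOIN (
         SELECT *
         FROM   (
           SELECT customer_id,
                  EXTRACT(HOUR FROM purchase_date) AS hour
           FROM   purchases
         )
         PIVOT (
           COUNT(*)
           FOR hour IN (7 AS h7, 23 AS h23)
         )
       ) pv
       ON c.customer_id = pv.customer_id
ORDER BY c.customer_id;
```

Here COALESCE is still needed: a customer with no purchases has no row in pv at all, so the joined columns are NULL regardless of whether the pivot uses COUNT or SUM.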
and the total purchases after hour 23 for each customer_id.
Add the hourly totals:
SELECT c.*,
COALESCE(h0, 0) AS h0,
COALESCE(h1, 0) AS h1,
COALESCE(h2, 0) AS h2,
COALESCE(h3, 0) AS h3,
COALESCE(h4, 0) AS h4,
COALESCE(h5, 0) AS h5,
COALESCE(h6, 0) AS h6,
COALESCE(h7, 0) AS h7,
COALESCE(h8, 0) AS h8,
COALESCE(h9, 0) AS h9,
COALESCE(h10, 0) AS h10,
COALESCE(h11, 0) AS h11,
COALESCE(h12, 0) AS h12,
COALESCE(h13, 0) AS h13,
COALESCE(h14, 0) AS h14,
COALESCE(h15, 0) AS h15,
COALESCE(h16, 0) AS h16,
COALESCE(h17, 0) AS h17,
COALESCE(h18, 0) AS h18,
COALESCE(h19, 0) AS h19,
COALESCE(h20, 0) AS h20,
COALESCE(h21, 0) AS h21,
COALESCE(h22, 0) AS h22,
COALESCE(h23, 0) AS h23,
COALESCE(h0, 0)
+ COALESCE(h1, 0)
+ COALESCE(h2, 0)
+ COALESCE(h3, 0)
+ COALESCE(h4, 0)
+ COALESCE(h5, 0)
+ COALESCE(h6, 0)
+ COALESCE(h7, 0)
+ COALESCE(h8, 0)
+ COALESCE(h9, 0)
+ COALESCE(h10, 0)
+ COALESCE(h11, 0)
+ COALESCE(h12, 0)
+ COALESCE(h13, 0)
+ COALESCE(h14, 0)
+ COALESCE(h15, 0)
+ COALESCE(h16, 0)
+ COALESCE(h17, 0)
+ COALESCE(h18, 0)
+ COALESCE(h19, 0)
+ COALESCE(h20, 0)
+ COALESCE(h21, 0)
+ COALESCE(h22, 0)
+ COALESCE(h23, 0) AS total
FROM ( SELECT customer_id,
EXTRACT(HOUR FROM PURCHASE_DATE) AS hour
FROM purchases
)
PIVOT (
COUNT(1)
FOR hour IN (
0 AS H0, 1 AS H1, 2 AS H2, 3 AS H3, 4 AS H4, 5 AS H5,
6 AS H6, 7 AS H7, 8 AS H8, 9 AS H9, 10 AS H10, 11 AS H11,
12 AS H12, 13 AS H13, 14 AS H14, 15 AS H15, 16 AS H16, 17 AS H17,
18 AS H18, 19 AS H19, 20 AS H20, 21 AS H21, 22 AS H22, 23 AS H23
)
) pv
RIGHT OUTER JOIN customers c
ON c.customer_id = pv.customer_id
ORDER BY c.customer_id;
Alternatively, rather than using PIVOT, you could use conditional aggregation:
SELECT c.customer_id,
MAX(c.first_name) AS first_name,
MAX(c.last_name) AS last_name,
COUNT(CASE EXTRACT(HOUR FROM p.purchase_date) WHEN 0 THEN 1 END) AS H0,
COUNT(CASE EXTRACT(HOUR FROM p.purchase_date) WHEN 1 THEN 1 END) AS H1,
COUNT(CASE EXTRACT(HOUR FROM p.purchase_date) WHEN 2 THEN 1 END) AS H2,
COUNT(CASE EXTRACT(HOUR FROM p.purchase_date) WHEN 3 THEN 1 END) AS H3,
COUNT(CASE EXTRACT(HOUR FROM p.purchase_date) WHEN 4 THEN 1 END) AS H4,
COUNT(CASE EXTRACT(HOUR FROM p.purchase_date) WHEN 5 THEN 1 END) AS H5,
COUNT(CASE EXTRACT(HOUR FROM p.purchase_date) WHEN 6 THEN 1 END) AS H6,
COUNT(CASE EXTRACT(HOUR FROM p.purchase_date) WHEN 7 THEN 1 END) AS H7,
COUNT(CASE EXTRACT(HOUR FROM p.purchase_date) WHEN 8 THEN 1 END) AS H8,
COUNT(CASE EXTRACT(HOUR FROM p.purchase_date) WHEN 9 THEN 1 END) AS H9,
COUNT(CASE EXTRACT(HOUR FROM p.purchase_date) WHEN 10 THEN 1 END) AS H10,
COUNT(CASE EXTRACT(HOUR FROM p.purchase_date) WHEN 11 THEN 1 END) AS H11,
COUNT(CASE EXTRACT(HOUR FROM p.purchase_date) WHEN 12 THEN 1 END) AS H12,
COUNT(CASE EXTRACT(HOUR FROM p.purchase_date) WHEN 13 THEN 1 END) AS H13,
COUNT(CASE EXTRACT(HOUR FROM p.purchase_date) WHEN 14 THEN 1 END) AS H14,
COUNT(CASE EXTRACT(HOUR FROM p.purchase_date) WHEN 15 THEN 1 END) AS H15,
COUNT(CASE EXTRACT(HOUR FROM p.purchase_date) WHEN 16 THEN 1 END) AS H16,
COUNT(CASE EXTRACT(HOUR FROM p.purchase_date) WHEN 17 THEN 1 END) AS H17,
COUNT(CASE EXTRACT(HOUR FROM p.purchase_date) WHEN 18 THEN 1 END) AS H18,
COUNT(CASE EXTRACT(HOUR FROM p.purchase_date) WHEN 19 THEN 1 END) AS H19,
COUNT(CASE EXTRACT(HOUR FROM p.purchase_date) WHEN 20 THEN 1 END) AS H20,
COUNT(CASE EXTRACT(HOUR FROM p.purchase_date) WHEN 21 THEN 1 END) AS H21,
COUNT(CASE EXTRACT(HOUR FROM p.purchase_date) WHEN 22 THEN 1 END) AS H22,
COUNT(CASE EXTRACT(HOUR FROM p.purchase_date) WHEN 23 THEN 1 END) AS H23,
COUNT(p.purchase_date) AS total
FROM customers c
LEFT OUTER JOIN purchases p
ON c.customer_id = p.customer_id
GROUP BY c.customer_id
ORDER BY c.customer_id;
Which, for the sample data, both output:
CUSTOMER_ID | FIRST_NAME | LAST_NAME | H0 | H1 | H2 | H3 | H4 | H5 | H6 | H7 | H8 | H9 | H10 | H11 | H12 | H13 | H14 | H15 | H16 | H17 | H18 | H19 | H20 | H21 | H22 | H23 | TOTAL |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
1 | Faith | Aaron | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 2 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 4 | 6 |
2 | Lisa | Jones | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
3 | Roz | Doyle | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 6 | 0 | 0 | 0 | 0 | 0 | 0 | 7 |