I have a items_invoiced table I have ordered that invoice table based on service id and invoice date so you will see groups of invoice_items that have renew peridically, every month. year. etc.
items_invoiced
invoice_id | service_id | invoice_date | plan_id | paid_amount |
---|---|---|---|---|
01 | 1 | 2024-03-08 | 2 | 30.00 |
09 | 1 | 2023-03-08 | 1 | 25.00 |
08 | 2 | 2024-04-08 | 1 | 30.00 |
034 | 2 | 2024-03-08 | 1 | 25.00 |
013 | 3 | 2024-03-08 | 2 | 30.00 |
010 | 3 | 2023-03-08 | 1 | 25.00 |
013 | 3 | 2021-03-08 | 1 | 30.00 |
010 | 3 | 2020-03-08 | 1 | 25.00 |
I need to find a way to add another column called 'plan_upgrade' that will look into each group of invoices that belong to a service_id and if there's a change in the plan_id (previous plan id es less than current plan_id) it would return TRUE, if not FALSE
items_invoiced_enhanced
invoice_id | service_id | invoice_date | plan_id | paid_amount | plan_upgrade |
---|---|---|---|---|---|
01 | 1 | 2024-03-08 | 2 | 30.00 | TRUE |
09 | 1 | 2023-03-08 | 1 | 25.00 | FALSE |
08 | 2 | 2024-04-08 | 1 | 30.00 | FALSE |
034 | 2 | 2024-03-08 | 1 | 25.00 | FALSE |
013 | 3 | 2024-03-08 | 2 | 30.00 | TRUE |
010 | 3 | 2023-03-08 | 1 | 25.00 | FALSE |
013 | 3 | 2021-03-08 | 1 | 30.00 | FALSE |
010 | 3 | 2020-03-08 | 1 | 25.00 | FALSE |
notice that the table above shows 3 services, ids 1, 2 and 3, the first service gets renewed every year and the plan id changed on the second renewal, the 2nd service gets renewed every month but the plan id did not changed on the renewal, the third gets also renewed every year for the past 4 years and the upgrade (change of plan id from 1 to 2) happened on the last renewal year
is there a way to do this in BQ?
I'm aware thereś a window function, but understanding how it works it's sort of a challenge at my level of of SQl and BQ knowledge, so if someone can explain to me how this would actually be accomplished with a window function or another approach, I´d really appreciate it.
Window functions perform calculations over a set of rows, which is defined by OVER
. In your case, service_id
and the last invoice_date
is the set you want to have the calculation performed on. PARTITION BY
says the independent grouping in the row subsets you want the calculation performed on, service_id
. ORDER BY
gives a ordering that you want the calculation to occur, invoice_date
since you want to see what happened on the previous date. The LAG()
function looks at the previous row (which is sorted now) and in your case what was the previous plan_id
for the last service_date
for this particular service_id
.
-- INIT database
CREATE TABLE invoices (
invoice_id VARCHAR(10),
service_id INT,
invoice_date DATE,
plan_id INT,
paid_amount DECIMAL(10, 2)
);
INSERT INTO invoices (invoice_id, service_id, invoice_date, plan_id, paid_amount) VALUES
('01', 1, '2024-03-08', 2, 30.00),
('09', 1, '2023-03-08', 1, 25.00),
('08', 2, '2024-04-08', 1, 30.00),
('034', 2, '2024-03-08', 1, 25.00),
('013', 3, '2024-03-08', 2, 30.00),
('010', 3, '2023-03-08', 1, 25.00),
('013', 3, '2021-03-08', 1, 30.00),
('010', 3, '2020-03-08', 1, 25.00);
-- QUERY database
SELECT
invoice_id,
service_id,
invoice_date,
plan_id,
paid_amount,
CASE
WHEN plan_id != LAG(plan_id) OVER (PARTITION BY service_id ORDER BY invoice_date) THEN TRUE
ELSE FALSE
END AS plan_upgrade
FROM invoices
ORDER BY service_id, invoice_date;