I have a table and the data is in below format.
I need the query in databricks sql
I have tried self join and some lag functions but it doesn't work. Please let me know if I am not making any sense.
Thanks, MG
You can do join on the tables filtered on column Subcategory
to get your desired results.
Use below code for it.
SELECT Programid, Category, Subcategory, ReportMon, ReportYea,
Oct, Nov, Dec, Jan, Feb, Mar, Apr, May, Jun, Jul, Aug, Sep
FROM original_data
UNION ALL
SELECT o.Programid, o.Category, '3. Net' AS Subcategory, '11/2/2023' AS ReportMon, 'FY2024' AS ReportYea,
o.Oct - e.Oct, o.Nov - e.Nov, o.Dec - e.Dec, o.Jan - e.Jan, o.Feb - e.Feb,
o.Mar - e.Mar, o.Apr - e.Apr, o.May - e.May, o.Jun - e.Jun, o.Jul - e.Jul, o.Aug - e.Aug, o.Sep - e.Sep
FROM (SELECT * from original_data where Subcategory='1. Adjusted') o
JOIN (SELECT * from original_data where Subcategory='2. Estimated') e
ON o.Programid = e.Programid
AND o.Category = e.Category
AND o.ReportMon = e.ReportMon
AND o.ReportYea = e.ReportYea
Here, i am filtering the records on Subcategory
and joining them on same
Programid
,Category
,ReportMon
and ReportYea
.
Output:
Programid | Category | Subcategory | ReportMon | ReportYea | Oct | Nov | Dec | Jan | Feb | Mar | Apr | May | Jun | Jul | Aug | Sep |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
12234 | ABC | 1. Adjusted | 11/1/2023 | FY2023 | 23455.55 | 45464.43 | 456959.60 | 154531.60 | 54521.45 | 511231.50 | 35235.67 | 47575.23 | 48645.87 | 54524.89 | 11353.14 | 13515.45 |
12234 | ABC | 2. Estimated | 11/1/2023 | FY2023 | 15311.23 | 54465.35 | 53531.45 | 5453.45 | 546.40 | 5856.45 | 5454.56 | 452.42 | 458.50 | 5661.46 | 1542.45 | 5145.67 |
12234 | ABC | 3. Net | 11/2/2023 | FY2024 | 8144.32 | -9000.92 | 403428.15 | 149078.15 | 53975.05 | 505375.05 | 29781.11 | 47122.81 | 48187.37 | 48863.43 | 9810.69 | 8369.78 |