I am fairly new to SQL. I have the following table structure. For a given SITEID and a given YEAR, I am trying to find the "new" CLASS that was added i.e a CLASS that was not present in the previous year. Each site can have multiple classes.
<table><tbody><tr><th>SITEID</th><th>YEAR</th><th>CLASS</th></tr><tr><td>1</td><td>2007</td><td>A</td></tr><tr><td>1</td><td>2007</td><td>B</td></tr><tr><td>1</td><td>2008</td><td>A</td></tr><tr><td>1</td><td>2008</td><td>B</td></tr><tr><td>1</td><td>2008</td><td>C</td></tr></tbody></table>
In the above case, the final output that I am looking for is:
<table><tbody><tr><th>SITEID</th><th>YEAR</th><th>CLASS</th></tr><tr><td>1</td><td>2007</td><td> </td></tr><tr><td>1</td><td>2008</td><td>C</td></tr></tbody></table>
Would really appreciate your help. Thanks.
Please try this (please replace temp table with your table):
If you are using SQL server, then you can use Common Table Expressions (CTE) as blow:
IF (OBJECT_ID('tempdb..#temp_table') IS NOT NULL)
BEGIN
DROP TABLE #temp_table
END;
CREATE TABLE #temp_table (SiteId INT NOT NULL, [year] INT, [class] [Char] NOT NULL)
INSERT INTO #temp_table (SiteId, [year], [class])
values
(1, 2007, 'A'),
(1, 2007, 'B'),
(1, 2008, 'A'),
(1, 2008, 'B'),
(1, 2008, 'C')
;with temp_cte as
(SELECT siteid, [year], [year]-1 as [yearbefore]
,STUFF((SELECT '|' + CAST([class] AS VARCHAR(MAX)) [text()]
FROM #temp_table
WHERE SiteId = t.SiteId and [year] = t.[year]
FOR XML PATH(''), TYPE)
.value('.','NVARCHAR(MAX)'),1,2,' ') [class_list]
FROM #temp_table t
GROUP BY SiteId, [year]
)
select c.SiteId
, c.[year]
, case when t.yearbefore = null then null else right(REPLACE(c.class_list, t.class_list, ''), LEN(REPLACE(c.class_list, t.class_list, ''))-1) end as [class_added]
, case when t.yearbefore = null then null else stuff(REPLACE(c.class_list, t.class_list, ''), 1, charindex('|', REPLACE(c.class_list, t.class_list, '')), '') end as [class_added_using_stuff]
from temp_cte c
left join temp_cte t on c.[yearbefore] = t.[year]
If you are not using SQL Server (the below works in SQL server as well), then you can use joins (probably need to adjust the queries respective to RDBMS that is in use) as below:
select distinct t1.SiteId, t1.[year]
, case when t1.[year] = m.[year] then null else t1.[class] end as class_added
from #temp_table t1
left join #temp_table t2 on t1.SiteId = t2.SiteId and t1.class = t2.class and t1.[year] -1 = t2.[year]
left join (select top 1 * from #temp_table order by [year] ) m on t1.[year] = m.[year]
where t2.SiteId is null
UPDATE for MYSQL:
CREATE TABLE test_table(
SiteId INT NOT NULL,
year INT,
class CHAR(1)
);
INSERT INTO
test_table(SiteId, year, class)
VALUES
(1, 2007, 'A'),
(1, 2007, 'B'),
(1, 2008, 'A'),
(1, 2008, 'B'),
(1, 2008, 'C');
select distinct t1.SiteId, t1.year
, case when t1.year = m.year then null else t1.class end as class_added
from test_table t1
left join test_table t2 on t1.SiteId = t2.SiteId and t1.class = t2.class and t1.year -1 = t2.year
left join (select * from test_table order by year limit 1) m on t1.year = m.year
where t2.SiteId is null;
MY SQL Fiddle here (http://sqlfiddle.com/#!9/c570d57/1/0)