Given the following data:
name | temp
-----------
hoi | 15
hoi | 15
hoi | 16
hoi | 15
hej | 13
hoi | 13
I would like to select the data in the given two columns without duplicates, However I do want to keep duplicates that are duplicates if they where interrupted by another value:
name | temp
-----------
hoi | 15 // selected
hoi | 15 // ignored duplicate
hoi | 15 // ignored duplicate
hoi | 16 // selected
hoi | 15 // selected because while not being unique it follows a different value
hoi | 15 // ignored duplicate
hej | 13 // selected
hoi | 13 // selected
hoi | 13 // ignored duplicate
hoi | 14 // selected
hoi | 13 // selected because while not being unique it follows a different value
This question was hard to formulate for me given English is not my native tongue, Feel free to edit the question or ask for clarifications.
Edit: There is an id field and a datetime field.
Edit 2: I use mySQL 5.7
Since you are using MySQL 5.7, which doesn't support analytical functions, you will need to use variables to store the values of temp and name, from the previous row:
SELECT t.ID,
t.Name,
t.Temp
FROM ( SELECT t.*,
IF(@temp = t.temp AND @name = t.Name, 1, 0) AS IsDuplicate,
@temp:= t.temp,
@name:= t.Name
FROM YourTable AS t
CROSS JOIN (SELECT @temp := 0, @name := '') AS v
ORDER BY t.ID
) AS t
WHERE t.IsDuplicate = 0
ORDER BY ID;
The key parts are (not in the order in which they appear, but in the order in which it is logical to think about it).
(1) Initialise the variables, and order by ID (or whatever field(s) you like) to ensure variables are assigned in the correct order
CROSS JOIN (SELECT @temp := 0, @name := '') AS v
ORDER BY t.ID
(2) Check if the values stored in the variables matches the current row, and flag with a 1 or a 0
IIF(@temp = t.temp AND @name = t.Name, 1, 0) AS IsDuplicate
(3) Assign the values of temp and name in the current row to the variables, so they can be checked against the next row:
@temp:= t.temp,
@name:= t.Name
(4) Remove duplicates from the final data set:
WHERE t.IsDuplicate = 0;
To go one further, you could change the IsDuplicate
flag to be a group marker, and use GROUP BY
, so you can find out how many records there were in total, while still not displaying duplicates:
SELECT MIN(ID) AS FirstID,
t.Name,
t.Temp,
COUNT(*) AS Records,
MAX(ID) AS LastID
FROM ( SELECT t.*,
@group:= IF(@temp = t.temp AND @name = t.Name, @group, @group + 1) AS GroupID,
@temp:= t.temp,
@name:= t.Name
FROM YourTable AS t
CROSS JOIN (SELECT @temp := 0, @name := '', @group:= 0) AS v
ORDER BY t.ID
) AS t
GROUP BY t.GroupID, t.Name, t.Temp
ORDER BY t.GroupID;
This may be surplus to requirements, but it can be useful as you are able to extract a lot more information than when just identifying duplicate rows.
Finally if/when you upgrade to version 8.0 or newer, you will be able to use ROW_NUMBER()
, or if you move to any other DBMS that supports ROW_NUMBER()
(which is most nowadays), then you can use the following:
SELECT MIN(ID) AS FirstID,
t.Name,
t.Temp,
COUNT(*) AS Records,
MAX(ID) AS LastID
FROM ( SELECT t.*,
ROW_NUMBER() OVER(ORDER BY ID) -
ROW_NUMBER() OVER(PARTITION BY Temp, Name ORDER BY ID) AS GroupID
FROM YourTable AS t
ORDER BY t.ID
) AS t
GROUP BY t.GroupID, t.Name, t.Temp
ORDER BY t.GroupID;