Search code examples
mysqlgreatest-n-per-group

MySQL DISTINCT for not all columns


CREATE TABLE IF NOT EXISTS FRUITS (
  ID INT(11) NOT NULL AUTO_INCREMENT,
  owner VARCHAR(45) DEFAULT NULL,
  fruit VARCHAR(45) DEFAULT NULL,
  colour VARCHAR(45) DEFAULT NULL,
  comments VARCHAR(45) DEFAULT NULL,
  incident_date DATE DEFAULT NULL,
  PRIMARY KEY (ID)
) ENGINE=MyISAM;
INSERT INTO `your_db`.`FRUITS` (ID, owner, fruit, colour, comments, incident_date) VALUES 
  (NULL, 'Sam', 'apple', 'green', 'bought', '2016-11-01'),
  (NULL, 'John', 'lemon', 'yellow', 'borrowed', '2016-11-02'),
  (NULL, 'Oscar', 'lemon', 'yellow', 'found', '2016-11-03'),
  (NULL, 'Oscar', 'apple', 'green', 'stolen', '2016-11-04'),
  (NULL, 'Sam', 'kiwi', 'green', 'inherited', '2016-11-05'),
  (NULL, 'Oscar', 'apple', 'green', 'eaten', '2016-11-06'),
  (NULL, 'Oscar', 'apple', 'yellow', 'grown', '2016-11-09');

I have a Datatable with people and their possessions that gets its data from MySQL query, and I need to filter out the apples that Oscar has handled. I don't care if they were found, stolen or eaten, but I need to get the last incident (that it was eaten on November 6th). Basically, only 2 rows:

Oscar | apple | green  | eaten | 2016-11-06
Oscar | apple | yellow | grown | 2016-11-09

This one:

SELECT DISTINCT * FROM FRUITS 
  WHERE owner LIKE 'Osc%' AND fruit LIKE 'apple' AND colour LIKE '%%'
  ORDER BY owner ASC

returns 3 rows (green ones both stolen and eaten), but I need only the latter.

It would be nice to have a possibility to separate Distinct columns like this:

SELECT comments, incident_date, DISTINCT owner, fruit, colour 
  FROM FRUITS WHERE owner LIKE 'Osc%' AND fruit LIKE 'apple' AND colour LIKE '%%' 
  ORDER BY owner ASC

This possibility doesn't exist. Is there any others?

(Real case scenario is that advertisement proposals in a magazine are made to customers. Some proposals are made at different times to same area (ex. 1/2 page) at different times. I only need to filter out that a proposal to this customer for this ad size is rejected at that date. Two different rejections are confusing.)


Solution

  • This is a variation of the very common question.

    How can you get a row that is the most recent row for a given combination of owner/fruit/colour?

    In other words, try to join such a row (call it F1) to another row (call it F2) with the same owner/fruit/colour, and a more recent id. If no match is found, the outer join will return all NULLs for F2.*.

    SELECT F1.* FROM FRUITS AS F1
    LEFT OUTER JOIN FRUITS AS F2
      ON F1.owner = F2.owner
      AND F1.fruit = F2.fruit
      AND F1.colour = F2.colour
      AND F1.ID < F2.ID
    WHERE F2.ID IS NULL AND F1.owner = 'Oscar' AND F1.fruit = 'apple';