Search code examples
sqlmysqlgroup-by

Select MAX () and GROUP BY ... Missing link between columns


I am still at beginner level on SQL programming so please bear in mind ...

My database is similar to the below:

+---------------------+------------+--------+------+--+
| Date                | FileName   | Type   | File |  |
+---------------------+------------+--------+------+--+
| 2023-06-15 08:03:04 | InputParam | Param  | .... |  |
+---------------------+------------+--------+------+--+
| 2023-06-15 09:15:34 | InputParam | Param  | .... |  |
+---------------------+------------+--------+------+--+
| 2023-06-16 01:05:07 | OutputSet  | Set    | .... |  |
+---------------------+------------+--------+------+--+
| 2023-06-16 11:22:04 | ConnConfig | Config | .... |  |
+---------------------+------------+--------+------+--+
| 2023-06-16 17:33:00 | ConnConfig | Config | .... |  |
+---------------------+------------+--------+------+--+
| 2023-06-17 06:06:28 | ConnConfig | Config | .... |  |
+---------------------+------------+--------+------+--+

I need to create a table showing all the above columns but with the most recent entries only, thus, similar to the below:

+---------------------+------------+--------+------+--+
| Date                | FileName   | Type   | File |  |
+---------------------+------------+--------+------+--+
| 2023-06-15 09:15:34 | InputParam | Param  | .... |  |
+---------------------+------------+--------+------+--+
| 2023-06-16 01:05:07 | OutputSet  | Set    | .... |  |
+---------------------+------------+--------+------+--+
| 2023-06-17 06:06:28 | ConnConfig | Config | .... |  |
+---------------------+------------+--------+------+--+

My strategy was to:

SELECT MAX(Date) As Date, FileName, Type, File FROM myTable WHERE File IS NOT NULL GROUP BY FileName;

However, this is not holding the relation between the FileName and the File. The Date and FileName are filtered and grouped as expected.

Please note that File is BLOB type.


Solution

  • You can try this

    WITH RankedEntries AS (
        SELECT
            Date,
            FileName,
            Type,
            File,
            ROW_NUMBER() OVER (PARTITION BY FileName ORDER BY Date DESC) AS rn
        FROM
            myTable
        WHERE
            File IS NOT NULL
    )
    SELECT
        Date,
        FileName,
        Type,
        File
    FROM
        RankedEntries
    WHERE
        rn = 1;
    

    Here, RankedEntries CTE assigns a row number to each entry within each FileName partition based on the Date in descending order

    Query selects the rows from the RankedEntries CTE where rn is equal to 1. This ensures that you get the most recent entry for each FileName while retaining all the columns (Date, FileName, Type, and File).