Search code examples
sqlexceljet

Select Distinct using JET and no PKEY - Duplicate rows


There seems to be a lot of threads on this topic but few that work with Excel.

I have a simple table from which i want to select:

  1. ideally all columns i.e. using * if possible so if a user adds new columns they do not need to edit SQL. (is this a pipe dream?) if so a solution specifying all the returned columns is OK.

  2. only return rows where [name]&[date] (concatenated) is distinct

  3. all other columns i don't care about which row is returned. first, last, limit 1... anything. they are a mix of all types.

  4. this must not create a new table or delete rows, just selecting and joining

    name    date        sales
    andy    01/01/2010  100
    andy    01/01/2010  900
    andy    05/01/2010  100
    alex    02/02/2010  200
    alex    02/02/2010  200
    alex    05/01/2010  200
    dave    09/09/2010  300
    dave    09/09/2010  300
    dave    01/09/2010  300
    

Also code simplicity is prefered over speed. This is going to be left to run over night so nice looking but slow is fine... and excel doesn't have millions of rows!

Many thanks to everyone in advance.

UPDATE

I would expect the table to look like this:

name    date        sales
andy    01/01/2010  100
andy    05/01/2010  100
alex    02/02/2010  200
alex    05/01/2010  200
dave    09/09/2010  300
dave    01/09/2010  300

            or
andy    01/01/2010  900
andy    05/01/2010  100
alex    02/....

I can select all the 'unique things with this:

SELECT MAX(joined)
    FROM
        (SELECT [Single$].[date] AS [date],
          [Single$].[name] AS [name],
           name & date AS [joined]
        FROM [Single$]
        )
GROUP  BY joined
HAVING MAX(joined) IS NOT NULL

But i don't know how to somehow join this back to the original table keeping any single row where the join matches. And i don't know if a join is the right way about this? Thanks


Solution

  • Simply run an aggregate query grouped by [Name] and [Date]. For all other columns run an aggregate like MAX() or MIN() which should work on numeric and string values.

    SELECT  [Single$].[name] AS [name], [Single$].[date] AS [date],
            MAX([Single$].[sales]) As [sales],
            MAX(...)       
    FROM [Single$]
    GROUP BY [Single$].[name] AS [name], [Single$].[date] AS [date]