Search code examples
sqlamazon-web-servicesgroup-byamazon-athenapyathena

SYNTAX_ERROR: '"LastName"' must be an aggregate expression or appear in GROUP BY clause


I have a two tables, main_table & staging_table, main_table contains original data whereas staging_table contains the few of the updated records that I have to add into with main_table data, and for that I am using unique id -PersonID and arrival time - date Below is the query which I am able to execute into SQL

SELECT PersonID, LastName, FirstName, Address, City, max(date) 
from 
(
select PersonID, LastName, FirstName, Address, City, date from main_table
UNION
select PersonID, LastName, FirstName, Address, City, date from staging_table
) as t
GROUP by t.PersonID;

but while executing into AWS Athena, I am getting following error, SYNTAX_ERROR: '"LastName"' must be an aggregate expression or appear in GROUP BY clause


Solution

  • I suspect that the other columns might differ and you actually want the full record from the most recent date. If this is the case, use row_number():

    select p.*
    from (select p.*,
                 row_number() over (partition by personid order by date desc) as seqnum
          from ((select PersonID, LastName, FirstName, Address, City, date
                 from main_table
                ) union all
                (select PersonID, LastName, FirstName, Address, City, date
                 from staging_table
                )
               ) p
         ) p
    where seqnum = 1;
    

    This select one row per PersonId with the most recent date. The columns come from the most recent row.