Search code examples
azure-databricksdatabricks-sql

Databricks SQL Error: Add to group by or wrap in first() (or first_value) if you don't care which value you get


I'm getting the error:

TableName is neither present in the group by, nor is it an aggregate function. Add to group by or wrap in first() (or first_value) if you don't care which value you get.;

When I execute the following code in Databricks SQL

SELECT 
ts_originationopportunity,
concat_ws(distinctreferrals.referralcompanycontact,'; ')  AS referralcompanycontacts
FROM
    (SELECT DISTINCT
     ts_originationopportunity,
     IFNULL(ts_referralcontactname + ' at ','') + ts_referralcompanyname AS referralcompanycontact
     FROM baseorigination.ts_referralsource) distinctreferrals
GROUP BY ts_originationopportunity

I had already received assistance with a similar question by @Tim Biegeleisen, however when applied the suggested code from the previous question I get no results from 'referralcompanycontact'

SELECT 
ts_originationopportunity,
array_join(collect_set(distinctreferrals.referralcompanycontact),'; ')  AS referralcompanycontacts
FROM
    (SELECT DISTINCT
     ts_originationopportunity,
     IFNULL(ts_referralcontactname + ' at ','') + ts_referralcompanyname AS referralcompanycontact
     FROM baseorigination.ts_referralsource) distinctreferrals
GROUP BY ts_originationopportunity

Any thoughts?

I have now tried the following:

SELECT
  ts_referralsource.ts_originationopportunity
 ,CONCAT_WS(ts_referralsource.ts_referralcontactname, ' at ', ts_referralsource.ts_referralcompanyname), ';') AS referralcompanycontact
FROM baseorigination.ts_referralsource
GROUP BY ts_referralsource.ts_originationopportunity

But getting the error:

== SQL ==
SELECT
  ts_referralsource.ts_originationopportunity
 ,CONCAT_WS(ts_referralsource.ts_referralcontactname, ' at ', ts_referralsource.ts_referralcompanyname), ';') AS referralcompanycontact
------------------------------------------------------------------------------------------------------------^^^

ok, I almost figured it out.

The following will get me my results without the groupby

SELECT
 ts_originationopportunity
 ,CONCAT(ts_referralcontactname, ' at ', ts_referralcompanyname) AS referralcompanycontact
FROM baseorigination.ts_referralsource
-- GROUP BY ts_referralsource.ts_originationopportunity

However, with the groupby I the following error:

'spark_catalog.baseorigination.ts_referralsource.ts_referralcontactname' is neither present in the group by, nor is it an aggregate function. Add to group by or wrap in first() (or first_value) if you don't care which value you get.;

Solution

  • I got a similar error when I tried to use group by in a similar manner. The following is a sample table on which I have tried to use similar query:

    enter image description here

    • When I tried the following query (similar to yours), I got the same error:
    SELECT a,CONCAT(b, ' at ', c) AS new FROM demo GROUP BY a
    

    enter image description here

    • This is because you are not aggregating the columns that you are selecting i.e., b and c in the above query (ts_referralcontactname and ts_referralcompanyname in your case).

    • So as suggested in the error message, you have to apply an appropriate aggregate function. I have used first in my case and got the desired result.

    SELECT a,CONCAT(first(b), ' at ', first(c)) AS new FROM demo GROUP BY a
    

    enter image description here

    • Now, since you are getting desired result even without group by, change your query as shown below:
    SELECT
     ts_originationopportunity
     ,CONCAT(first(ts_referralcontactname), ' at ', first(ts_referralcompanyname)) AS referralcompanycontact
    FROM baseorigination.ts_referralsource GROUP BY ts_referralsource.ts_originationopportunity