Search code examples
salesforcesoql

Find Duplicate, SPAM Accounts in Salesforce - SOQL Query


I am trying to find and then delete the duplicates accounts object in salesforce

SELECT 
  Name, 
  BillingStreet, 
  BillingCity, 
  BillingState, 
  BillingPostalCode, 
  BillingCountry, 
  COUNT(Id) 
FROM 
  Account 
GROUP BY 
  Name, 
  BillingStreet, 
  BillingCity, 
  BillingState, 
  BillingPostalCode, 
  BillingCountry 
HAVING 
  COUNT(Id) > 1

I tried to run the query, but it throws the following error

[object Object]: Aggregate query does not support queryMore(), use LIMIT to restrict the results to a single batch

Is this the right query to pull the duplicates? How do Identify the SPAM? The spam will either have some special characters, or names like aaaaa or bbbbb.


Solution

  • What you have should work, but I'm betting that the total record count from "SELECT Id FROM Account" is quite high (perhaps > 50,000). In which case, you may need to batch it and add "LIMIT N" to the query, where N is something at, or below, whatever upper limit (e.g. 50000) is being reached for the aggregate query.

    If this is the case, you will have remaining records after that LIMIT to check. One way to "batch" the queries is to take the last Name that is returned in the previous "batch" and add a 'WHERE Name>{lastName}" to a subsequent query.

    Lets say for example, this is query #1:

    SELECT 
      Name, 
      BillingStreet, 
      BillingCity, 
      BillingState, 
      BillingPostalCode, 
      BillingCountry, 
      COUNT(Id) 
    FROM 
      Account 
    GROUP BY 
      Name, 
      BillingStreet, 
      BillingCity, 
      BillingState, 
      BillingPostalCode, 
      BillingCountry 
    HAVING 
      COUNT(Id) > 1
    LIMIT
      50000
    

    If the lase Name returned in that list is "TNT COMPANY" the 2nd query could be:

    SELECT 
      Name, 
      BillingStreet, 
      BillingCity, 
      BillingState, 
      BillingPostalCode, 
      BillingCountry, 
      COUNT(Id) 
    FROM 
      Account 
    WHERE
      Name>'TNT COMPANY'
    GROUP BY 
      Name, 
      BillingStreet, 
      BillingCity, 
      BillingState, 
      BillingPostalCode, 
      BillingCountry 
    HAVING 
      COUNT(Id) > 1
    LIMIT
      50000
    

    And you can iterate as necessary to run through all records and avoid the queryMore() error you're receiving with the aggregate query. I imagine after removing some of the accounts deemed as spam, you might not run into that query limit (as often).