Search code examples
sqlpostgresqlaggregatepostgisrank

Rank order ST_DWithin results by the number of radii a result appears in


I have a table of existing customers and another table of potential customers. I want to return a list of potential customers rank ordered by the number of radii of existing purchasers that they appear in.

There are many rows in the potential customers table per each existing customer, and the radius around a given existing customer could encompass multiple potential customers. I want to return a list of potential customers ordered by the count of the existing customer radii that they fall within.

SELECT pur.contact_id AS purchaser, count(pot.*) AS nearby_potential_customers 
FROM purchasers_geocoded pur, potential_customers_geocoded pot 
WHERE ST_DWithin(pur.geom,pot.geom,1000)
GROUP BY purchaser;

Does anyone have advice on how to proceed?

EDIT:

With some help, I wrote this query, which seems to do the job, but I'm verifying now.

WITH prequalified_leads_table AS (
   SELECT *
   FROM nearby_potential_customers
   WHERE market_val > 80000
   AND   market_val < 120000
   )
, proximate_to_existing AS (
   SELECT pot.prop_id AS prequalified_leads
   FROM purchasers_geocoded pur, prequalified_leads_table pot
   WHERE ST_DWithin(pot.geom,pur.geom,100)
  )
SELECT prequalified_leads, count(prequalified_leads)
FROM proximate_to_existing
GROUP BY prequalified_leads
ORDER BY count(*) DESC;

Solution

  • I want to return a list of potential customers ordered by the count of the existing customer radii that they fall within.

    Your query tried the opposite of your statement, counting potential customers around existing ones.
    Inverting that, and after adding some tweaks:

    SELECT pot.contact_id AS potential_customer
         , rank() OVER (ORDER BY pur.nearby_customers DESC
                               , pot.contact_id) AS rnk
         , pur.nearby_customers 
    FROM   potential_customers_geocoded pot
    LEFT   JOIN LATERAL (
       SELECT count(*) AS nearby_customers 
       FROM   purchasers_geocoded pur
       WHERE  ST_DWithin(pur.geom, pot.geom, 1000)
       ) pur ON true
    ORDER  BY 2;
    

    I suggest a subquery with LEFT JOIN LATERAL ... ON true to get counts. Should make use of the spatial index that you undoubtedly have:

    CREATE INDEX ON purchasers_geocoded USING gist (geom);
    

    Thereby retaining rows with 0 nearby customers in the result - your original join style would exclude those. Related:

    Then ORDER BY the resulting nearby_customers in the outer query (not: nearby_potential_customers).

    It's not clear whether you want to add an actual rank. Use the window function rank() if so. I made the rank deterministic while being at it, breaking ties with an additional ORDER BY expression: pot.contact_id. Else, peers are returned in arbitrary order which can change for every execution.

    ORDER BY 2 is short syntax for "order by the 2nd out column". See:

    Related: