I need to find most frequent value of object_of_search
for each ethnicity. How can I achieve this? Subqueries in the SELECT
clause and correlated subqueries are not allowed. Something similar to this:
mode() WITHIN GROUP (ORDER BY stopAndSearches.object_of_search) AS "Most frequent object of search"
But this does not aggregate and gives me many rows for each ethnicity and object_of_search:
officer_defined_ethnicity | Sas for ethnicity | Arrest rate | Most frequent object of search
---------------------------+-------------------+------------------+--------------------------------
ethnicity2 | 3 | 66.6666666666667 | Stolen goods
ethnicity3 | 2 | 100 | Fireworks
ethnicity1 | 5 | 60 | Firearms
ethnicity3 | 2 | 100 | Firearms
ethnicity1 | 5 | 60 | Cat
ethnicity1 | 5 | 60 | Dog
ethnicity2 | 3 | 66.6666666666667 | Firearms
ethnicity1 | 5 | 60 | Psychoactive substances
ethnicity1 | 5 | 60 | Fireworks
And should be something like this:
officer_defined_ethnicity | Sas for ethnicity | Arrest rate | Most frequent object of search
---------------------------+-------------------+------------------+--------------------------------
ethnicity2 | 3 | 66.6666666666667 | Stolen goods
ethnicity3 | 2 | 100 | Fireworks
ethnicity1 | 5 | 60 | Firearms
Table on fiddle.
Query:
SELECT DISTINCT
stopAndSearches.officer_defined_ethnicity,
count(stopAndSearches.sas_id) OVER(PARTITION BY stopAndSearches.officer_defined_ethnicity) AS "Sas for ethnicity",
sum(case when stopAndSearches.outcome = 'Arrest' then 1 else 0 end)
OVER (PARTITION BY stopAndSearches.officer_defined_ethnicity)::float /
count(stopAndSearches.sas_id) OVER(PARTITION BY stopAndSearches.officer_defined_ethnicity)::float * 100 AS "Arrest rate",
mode() WITHIN GROUP (ORDER BY stopAndSearches.object_of_search) AS "Most frequent object of search"
FROM stopAndSearches
GROUP BY stopAndSearches.sas_id, stopAndSearches.officer_defined_ethnicity;
Table:
CREATE TABLE IF NOT EXISTS stopAndSearches(
"sas_id" bigserial PRIMARY KEY,
"officer_defined_ethnicity" VARCHAR(255),
"object_of_search" VARCHAR(255),
"outcome" VARCHAR(255)
);
Updated: Fiddle
This should address the specific "which object per ethnicity" question.
Note, this doesn't address ties in the count. That wasn't part of the question / request.
Adjust your SQL to include this logic, to provide that detail:
WITH cte AS (
SELECT officer_defined_ethnicity
, object_of_search
, COUNT(*) AS n
, ROW_NUMBER() OVER (PARTITION BY officer_defined_ethnicity ORDER BY COUNT(*) DESC) AS rn
FROM stopAndSearches
GROUP BY officer_defined_ethnicity, object_of_search
)
SELECT * FROM cte
WHERE rn = 1
;
Result:
officer_defined_ethnicity | object_of_search | n | rn |
---|---|---|---|
ethnicity1 | Cat | 1 | 1 |
ethnicity2 | Stolen goods | 2 | 1 |
ethnicity3 | Fireworks | 1 | 1 |