Search code examples
sqloracle-databaseoracle11gaverageaggregate-functions

Finding rows which match the average value of a column


I'm using Oracle 11g. I have a couple of tables which look something like this (simplified a bit).

CREATE TABLE clients (
    id NUMBER(10) NOT NULL,
    gridx NUMBER(10),
    gridy NUMBER(10),
    CONSTRAINT clients_pk PRIMARY KEY (id)
);

CREATE TABLE requests (
    id NUMBER(10) NOT NULL,
    client_id NUMBER(10),
    CONSTRAINT clients_fk FOREIGN KEY (client_id) REFERENCES clients(id),
    CONSTRAINT requests_pk PRIMARY KEY (id)
);

I need to find the grid sections which have seen the average number of requests. So far, I have come up with the following query which lists the number of requests made for each distinct grid location.

SELECT joined_tbl.gridx, joined_tbl.gridy, COUNT(joined_tbl.id) requests_cnt FROM (
    SELECT c.gridx, c.gridy, r.id
    FROM requests r
    INNER JOIN clients c ON c.id=r.client_id GROUP BY c.gridx, c.gridy, r.id ORDER BY r.id
) joined_tbl
GROUP BY joined_tbl.gridx, joined_tbl.gridy;

This gives the following output

 GRIDX      GRIDY REQUESTS_CNT
     1         -3            2
     2          5            4
    -1         -3            4
    -3         -2            6

Next I need to take the average of the REQUESTS_CNT column and list all the rows which match the average value. How can I go about this? I can't use the AVG function in a WHERE clause, which I considered, so should I be using HAVING?


Solution

  • Your query can be boiled down to:

    SELECT c.gridx, c.gridy, COUNT(*) AS requests_cnt
    FROM requests r 
    JOIN clients c ON c.id = r.client_id 
    GROUP BY c.gridx, c.gridy;
    

    i.e. get the request count per gridx/gridy. You want to determine the average count and then only show gridx/gridy pairs that occur exactly that often. In your example this would be the pairs (2|5), (-1|-3), but most often, I guess, that would be no pairs at all.

    The easiest approach seems to be to get the average on-the-fly by applying AVG OVER:

    SELECT gridx, gridy
    FROM
    (
      SELECT 
        c.gridx, c.gridy, COUNT(*) AS requests_cnt,
        AVG(COUNT(*)) OVER () AS avg_requests_cnt
      FROM requests r 
      JOIN clients c ON c.id = r.client_id 
      GROUP BY c.gridx, c.gridy
    ) pairs
    WHERE requests_cnt = avg_requests_cnt
    ORDER BY gridx, gridy;