Supposing I have a table and query:
consisting of population for a given country in a given continent for a given year.
i want to return countries avg(population) & the continents avg population if the country's population is greater than the continents +3 Basically I want to filter out rows that are a certain difference away from the subtotal continent value.
I modified this and realize the data does not have multiple years and that the numbers are obviously garbage, but this is just an example.
create table abc (continent varchar2(30), country varchar2(30), population number, yr number)
insert into abc values ('africa', 'kenya', 50, 2005)
insert into abc values ('africa', 'egypt', 100, 2006)
insert into abc values('africa', 'south africa', 35, 2007)
insert into abc values ('africa', 'nigeria', 200, 2008)
insert into abc values ('asia', 'china', 50, 2005)
insert into abc values ('asia', 'india', 100, 2006)
insert into abc values('asia', 'japan', 35, 2007)
insert into abc values ('asia', 'korea', 200, 2008)
select continent, country, avg(population)
from abc
where ------population for each country > 3+ avg for each continent
----should return egpyt/nigeria rows and india/korea rows since average here is 96.25 for each continent.
group by rollup(continent, country)
So, with the definition of continent average as simply being the average of all rows with that continent, a solution can be:
select continent
, country
, avg(population) country_avg
, max(continent_avg) continent_avg
from (
select continent
, country
, population
, avg(population) over (
partition by continent
) continent_avg
from abc
)
group by continent, country
having avg(population) > max(continent_avg) + 3
order by continent, country;
The reason I asked about the definition of continent average is, that if some countries within a continent have more rows in the table (=more years), those countries will weigh more in the average calculated like that. Then an alternative can be that the continent average is the average of the country averages, in which case a solution can be:
select *
from (
select continent
, country
, avg(population) country_avg
, avg(avg(population)) over (
partition by continent
) continent_avg
from abc
group by continent, country
)
where country_avg > continent_avg + 3;
If the countries all have the same number of years (same number of rows), the two solutions ought to give the same result. But if countries can have different number of years, you will have to pick the solution that fits your requirements.