Is it possible to query multiple schema's within my Redshift database and combine the results?
Something like this:
SELECT DISTINCT name, SUM(usage) AS totalUsage
FROM schemaOne.terms, schemaTwo.terms
will throw an error:
Amazon Invalid operation: column reference "name" is ambiguous;
So, is this possible and would this be a good practice? I would like to create a schema for each client which will have it's own terms (tags) table and other tables.
For example
SchemaOne: terms
id, name, description, usage
_______________________________
1 css CSS is bla. 14
2 html HTML rocks! 9
SchemaTwo: terms
id, name, description, usage
_______________________________
1 css CSS is cool. 8
2 other x 4
Now I would like to combine the results, which could output something like:
name totalUsage
_________________
css 22
html 9
other 4
First do UNION ALL
in a derived table, then apply GROUP BY
:
select name, sum(usage) as totalUsage
from
(
select name, usage
from SchemaOne.terms
union all
select name, usage
from SchemaTwo.terms
) dt
group by name
Note that some products have name
and usage
as reserved words, so you may need to delimit them as "name"
and/or "usage"
.
Perhaps better performance version:
select name, sum(usage) as totalUsage
from
(
select name, sum(usage) as usage
from SchemaOne.terms
group by name
union all
select name, sum(usage) as usage
from SchemaTwo.terms
group by name
) dt
group by name
Original answer:
If the same column name exists in more than one table, you have to qualify that column, i.e put the table name before the column name:
SELECT DISTINCT schemaOne.terms.name, SUM(usage) as totalUsage
FROM schemaOne.terms, schemaTwo.terms
Table aliases are convenient:
SELECT DISTINCT s1t.name, SUM(usage) as totalUsage
FROM schemaOne.terms s1t, schemaTwo.terms s2t