sql amazon-web-services database-schema amazon-redshift multiple-tables

AWS Redshift query multiple schema's

Is it possible to query multiple schema's within my Redshift database and combine the results?

Something like this:

SELECT DISTINCT name, SUM(usage) AS totalUsage
FROM schemaOne.terms, schemaTwo.terms

will throw an error:

Amazon Invalid operation: column reference "name" is ambiguous;

So, is this possible and would this be a good practice? I would like to create a schema for each client which will have it's own terms (tags) table and other tables.

For example

SchemaOne: terms

id,  name,  description,  usage
_______________________________
1    css    CSS is bla.   14
2    html   HTML rocks!   9

SchemaTwo: terms

id,  name,  description,  usage
_______________________________
1    css    CSS is cool.  8
2    other  x             4

Now I would like to combine the results, which could output something like:

name   totalUsage
_________________
css    22
html   9
other  4

Solution

First do UNION ALL in a derived table, then apply GROUP BY:

select name, sum(usage) as totalUsage
from
(
    select name, usage
    from SchemaOne.terms
    union all
    select name, usage
    from SchemaTwo.terms
) dt
group by name

Note that some products have name and usage as reserved words, so you may need to delimit them as "name" and/or "usage".

Perhaps better performance version:

select name, sum(usage) as totalUsage
from
(
    select name, sum(usage) as usage
    from SchemaOne.terms
    group by name
    union all
    select name, sum(usage) as usage
    from SchemaTwo.terms
    group by name
) dt
group by name

Original answer:

If the same column name exists in more than one table, you have to qualify that column, i.e put the table name before the column name:

SELECT DISTINCT schemaOne.terms.name, SUM(usage) as totalUsage
FROM schemaOne.terms, schemaTwo.terms

Table aliases are convenient:

SELECT DISTINCT s1t.name, SUM(usage) as totalUsage
FROM schemaOne.terms s1t, schemaTwo.terms s2t