Search code examples
sqlsql-serversql-server-2017

Avoid repetition of aggregated column values from 2 left joined tables


I have 5 tables which contains rows like below:

RESEARCH_REPORTS table:

REPORT_ID   TOPIC
141         My Report Topic Title
142         Another Report Topic Title
143         Yet Another Report Topic Title
...

PROGRAM_AREAS_REPORTS_REL table:

REPORT_ID    PROGRAM_AREA_ID
141          6
141          11
141          12
...

PROGRAM_AREAS table:

PROGRAM_AREA_ID    TITLE
6                  Program Area One
11                 Program Area Two
12                 Program Area Three
...

RESEARCH_REPORT_CATEGORY_REL table:

REPORT_ID    CATEGORY_ID
141          9
141          10
141          18
141          23
...

RESEARCH_REPORT_CATEGORIES table:

CATEGORY_ID    NAME
9              Category One
10             Category Two
18             Category Three
23             Category Four
...

This query is currently returning me the results from down below:

SELECT rr.report_id, 
rr.topic, 
string_agg(pa.title, '|') as program_areas, 
string_agg(rrc.name, '|') as categories 
FROM RESEARCH_REPORTS rr 
LEFT JOIN PROGRAM_AREAS_REPORTS_REL parr ON rr.report_id = parr.report_id 
LEFT JOIN RESEARCH_REPORT_CATEGORY_REL rrcr ON rr.report_id = rrcr.report_id 
LEFT JOIN PROGRAM_AREAS pa ON parr.program_area_id = pa.program_area_id 
LEFT JOIN RESEARCH_REPORT_CATEGORIES rrc ON rrcr.category_id = rrc.category_id 
WHERE rr.report_id = 141
GROUP BY rr.report_id, rr.topic

Query Results:

|---------------------|-------------------------------------|-------------------------------------------------------------------------------------------------------------------------------------------------------------------|-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
|      report_id      |                topic                |                      program_areas                                                                                                                                |                                                                                     categories                                                                                                                                    |
|---------------------|-------------------------------------|-------------------------------------------------------------------------------------------------------------------------------------------------------------------|-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
|          141        |         My Report Topic Title       |   Program Area One|Program Area Two|Program Area Three|Program Area One|Program Area Two|Program Area Three|Program Area One|Program Area Two|Program Area Three  |    Category One|Category Two|Category Three|Category Four|Category One|Category Two|Category Three|Category Four|Category One|Category Two|Category Three|Category Four|Category One|Category Two|Category Three|Category Four    |
|---------------------|-------------------------------------|-------------------------------------------------------------------------------------------------------------------------------------------------------------------|-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|

If noticed, the aggregated values in the result columns program_areas and categories are repeated. I need to avoid these duplicates with a select query which results in the following format:

Preferred Query Results:

|---------------------|-------------------------------------|---------------------------------------------------------|--------------------------------------------------------------|
|      report_id      |                topic                |                      program_areas                      |                       categories                             |
|---------------------|-------------------------------------|---------------------------------------------------------|--------------------------------------------------------------|
|          141        |         My Report Topic Title       |   Program Area One|Program Area Two|Program Area Three  |    Category One|Category Two|Category Three|Category Four    |
|---------------------|-------------------------------------|---------------------------------------------------------|--------------------------------------------------------------|

How can I accomplish this on my current query?


Solution

  • You can try this

    SELECT  rr.report_id
            , rr.topic
            , ( SELECT  string_agg(pa.title, '|')
                FROM    PROGRAM_AREAS pa
                JOIN    PROGRAM_AREAS_REPORTS_REL parr
                ON      parr.program_area_id = pa.program_area_id 
                WHERE   parr.report_id = rr.report_id
                GROUP BY parr.report_id ) as program_areas
            , ( SELECT  string_agg(rrc.name, '|')
                FROM    RESEARCH_REPORT_CATEGORIES rrc 
                JOIN    RESEARCH_REPORT_CATEGORY_REL rrcr
                ON      rrcr.category_id = rrc.category_id 
                WHERE   rrcr.report_id = rr.report_id
                GROUP BY rrcr.report_id ) as categories
    FROM    RESEARCH_REPORTS rr 
    WHERE   rr.report_id = 141
    

    I'm not sure how to set order of names/titles that get concatenated by STRING_AGG though.