Search code examples
mysqlgroup-bysum

SQL - Columns merge and grouping by column name while summation of the other column


I have a question related to the SQL query. The data looks like this:

Name1 Name2 Category Total
ABC NULL Category1 100
DEF ABC Category1 20
GHI ABC Category3 300
XYZ DEF Category2 60
XYZ GHI Category3 80
NULL NULL Category2 90

I am trying to merge the Name1 and Name2 and grouping them based on Category with the Total having the sum if same name have same categories.

Thanks in advance

The method I tried were this (kinda similar to these two queries asked in stackoverflow): Match two columns and add the third column in SQL SQL: GROUP BY column B for groups in column a

pysqldf = lambda q1: sqldf(q1, globals())
q1 = """SELECT Name, SUM(Total) AS Total
FROM
(
    SELECT Name1 AS Name, SUM(Total) AS Total
    FROM df
    WHERE Name1 IS NOT NULL
    GROUP BY Name1
    UNION ALL
    SELECT Name2, SUM(Total)
    FROM df
    WHERE Name2 IS NOT NULL
    GROUP BY Name2
) t
GROUP BY Name
ORDER BY Name"""
df2 = pysqldf(q1)

Tried in pandas also but it gives different results with the following command:

df2 = df.groupby(['investigator','second_investigator','requested_test']).agg({'Count': 'sum'})

The output required is like this:

Name Category Total
ABC Category1 120
ABC Category3 300
DEF Category1 20
DEF Category2 60
GHI Category3 380
XYZ Category2 60
XYZ Category3 80

Solution

  • SELECT Name, Category, SUM(Total) AS Total
    FROM
    (
        SELECT Name1 AS Name, Category, SUM(Total) AS Total
        FROM df
        WHERE Name1 IS NOT NULL
        GROUP BY 1, 2
        UNION ALL
        SELECT Name2, Category, SUM(Total)
        FROM df
        WHERE Name2 IS NOT NULL
        GROUP BY 1, 2
    ) t
    GROUP BY 1, 2
    ORDER BY 1
    

    fiddle