sql google-bigquery

SUM this column


I want to SUM the total amount of each user's payments, but the result is different from what I expected.

I tried to use JOIN to combine 3 tables and list the user_id, the number of surveys, and the total amount for each user. But the total amount wasn't the same as the one I got when I queried the payments table separately.

SELECT
users.user_id,
COUNT(DISTINCT participations.survey_id) as total_survey,
SUM(amount) as total_amount
FROM `angket.users` as users
RIGHT JOIN `angket.participations` as participations
USING (user_id)
RIGHT JOIN `angket.payments` as payments
USING (user_id)
GROUP BY 1
ORDER BY 1

with JOIN

SELECT
user_id,
SUM(amount) as total_amount
FROM `angket.payments` as payments
LEFT JOIN `angket.users` as users
USING (user_id)
GROUP BY 1
ORDER BY 1



Solution

  • The cause is fairly easy to see once you compare the sums from the two queries:

    • User_id 'U0000' participated 8 times
    • User_id 'U0001' participated 4 times
    • User_id 'U0002' participated 3 times

    This leads to an erroneous partial Cartesian product: every payment row of a user is repeated once per participation of that user, so the amounts were summed 8, 4 and 3 times in the query that joins with participations.

    But since you need a COUNT(DISTINCT ...) from participations, try two different groupings in two nested queries:

    SELECT
      users.user_id
    , COUNT(DISTINCT participations.survey_id) as total_survey
    , total_amount
    FROM angket.users as users
    RIGHT JOIN angket.participations as participations
    USING (user_id)
    RIGHT JOIN (
      SELECT
        user_id
      , SUM(amount) AS total_amount
      FROM angket.payments 
      GROUP BY user_id
    ) sumpayments
    USING (user_id)
    GROUP BY 1, 3
    ORDER BY 1
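
To see the row multiplication in isolation, here is a minimal sketch on toy data. The inline rows, the survey ids S1 and S2, the amounts, and the names naive and inflated_amount are invented for illustration and are not taken from the asker's tables. It computes the sum both ways, once by joining first and once by aggregating payments before the join.

```sql
-- Toy data, invented for illustration only (not the asker's tables).
WITH participations AS (
  SELECT 'U0000' AS user_id, 'S1' AS survey_id UNION ALL
  SELECT 'U0000', 'S2' UNION ALL
  SELECT 'U0001', 'S1'
),
payments AS (
  SELECT 'U0000' AS user_id, 10 AS amount UNION ALL
  SELECT 'U0000', 20 UNION ALL
  SELECT 'U0001', 5
),
-- Join first, aggregate later: each payment row appears once per
-- participation row of the same user, so SUM counts it several times.
naive AS (
  SELECT
    user_id
  , COUNT(DISTINCT pa.survey_id) AS total_survey
  , SUM(py.amount) AS inflated_amount
  FROM participations AS pa
  JOIN payments AS py USING (user_id)
  GROUP BY user_id
),
-- Aggregate first: exactly one row per user before any join.
sumpayments AS (
  SELECT
    user_id
  , SUM(amount) AS total_amount
  FROM payments
  GROUP BY user_id
)
SELECT
  user_id
, naive.total_survey
, naive.inflated_amount
, sumpayments.total_amount
FROM naive
JOIN sumpayments USING (user_id)
ORDER BY user_id
```

With these toy rows, U0000 has 2 participations and 2 payments, so the join produces 4 rows and inflated_amount is 60 while total_amount is 30. U0001 has one row on each side, so both columns show 5. COUNT(DISTINCT ...) is unaffected by the duplication, which is why only the sum looked wrong.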