Search code examples
mysqlreplaceaggregate-functionscoalescerevenue

How can I replace null values in 1 column with the values from another column when working with an aggregate function? - SQL


I am working within the Stripe sigma application on Stripe's website and I am trying to calculate revenue by country. When customers fill in their payment information, they are asked to input their country but some fail to do so. The column is called card_country_address. When the payment is processed, the country where the card was issued is also recorded in a column called card_country.

enter image description here

When calculating the revenue, I am summing the amounts and grouping by card_country_address. However, because there are null values due to customers not filling in the correct information, I get 1 blank value in my card_address_country and an unassigned dollar amount. So, I would like to replace the null values with the value from card_country and include that into the calculation.

I read that I can replace the null values with COALESCE and I have written a statement to do so, however I am not sure how to include this correctly into a complete statement.

Here is what I have:

SELECT COALESCE(card_address_country, card_country)
FROM charges
WHERE card_address_country IS NULL

This seems to replace the null values correctly. The statement that I have to calculate the revenue by country is:

SELECT card_address_country AS "Country", sum(amount) AS "Total Revenue"
FROM charges
WHERE amount_refunded = 0 and paid = true
GROUP BY card_address_country
ORDER BY sum(amount) DESC

This is the problem I am having with this statement:

enter image description here

$12,934,033 isn't counted correctly because the card_address_country is null for numerous transactions.

I know some might be thinking, just group by card_country then. However, this column also contains null values and I would still have to do the same and add the missing country code from card_country_address. Also, card_address_country and card_country aren't always the same. I am using card_address_country because I would like to know where the customer purchased from.

How do I include the statement with COALESCE into the main statement so that I can distribute the amount linked to the null addressed correctly? Is that even the right way to go about this or is there a better method to replace and calculate the values?


Solution

  • Looks like GROUP BY COALESCE(card_address_country, card_country) will get the desired result set

    SELECT COALESCE(card_address_country, card_country) AS "Country", SUM(amount) AS "Total Revenue"
    FROM charges
    WHERE amount_refunded = 0 and paid = true
    GROUP BY COALESCE(card_address_country, card_country)
    ORDER BY SUM(amount) DESC