Search code examples
sqlmariadbmariadb-10.1

Count repeat records per day (without window functions)


I'm trying to get a count of repeat customer records per day and I'm having a bit of trouble using MariaDB 10.1 as window functions weren't introduced until 10.2 (therefore no partitioning, rank, etc)

I have an example set of data that looks like this:

| Date       | Country | Type      | Email         | Response_Time |
| ---------- | ------- | --------- | ------------- | ------------- |
| 2021-05-21 | AU      | Enquiry   | [email protected] | 910           |
| 2021-05-21 | AU      | Enquiry   | [email protected] | 1050          |
| 2021-05-21 | NZ      | Complaint | [email protected]  | 56            |
| 2021-05-22 | NZ      | Enquiry   | [email protected]  | 1000          |
| 2021-05-22 | NZ      | Enquiry   | [email protected]  | 845           |
| 2021-05-22 | NZ      | Enquiry   | [email protected]  | 700           |
| 2021-05-22 | NZ      | Complaint | [email protected]  | 217           |
| 2021-05-23 | UK      | Enquiry   | [email protected] | 843           |
| 2021-05-23 | NZ      | Enquiry   | [email protected]  | 1795          |
| 2021-05-23 | NZ      | Enquiry   | [email protected]  | 521           |
| 2021-05-23 | AU      | Complaint | [email protected] | 150           |

The above can be produced with the following query:

SELECT
    DATE(Start_Time) AS "Date",
    Country,
    Type,
    Email,
    Response_Time
FROM EMAIL_DETAIL
WHERE DATE(Start_Time) BETWEEN '2021-05-21' AND '2021-05-23'
AND COUNTRY IN ('AU','NZ','UK')
;

I'd like to get a count of email addresses that appear more than once in the group of day, country and type, and display it as a summary like this:

| Country | Type      | Volume | Avg_Response_Time | Repeat_Daily |
| ------- | --------- | ------ | ----------------- | ------------ |
| AU      | Enquiry   | 2      | 980               | 1            |
| AU      | Complaint | 1      | 150               | 0            |
| NZ      | Enquiry   | 5      | 972               | 3            |
| NZ      | Complaint | 1      | 137               | 0            |
| UK      | Enquiry   | 1      | 843               | 0            |

The repeat daily count is a count of records where the email address appeared more than once in the group of date, country and type. Volume is the total count of records per country and type.

I'm having a hard time with the lack of window functions in this version of MariaDB and any help would really be appreciated.

(Apologies for the tables formatted as code, I was getting a formatting error when trying to post otherwise)


Solution

  • Hmmm . . . I think this is two levels of aggregation:

    SELECT country, type, SUM(cnt) as volume,
           SUM(Total_Response_Time) / SUM(cnt) as avg_Response_time,
           SUM(CASE WHEN cnt > 1 THEN cnt ELSE 0 END) as repeat_daily
    FROM (SELECT DATE(Start_Time) AS "Date", Country, Type, Email,
                 SUM(Response_Time) as Total_Response_Time, COUNT(*) as cnt
          FROM EMAIL_DETAIL
          WHERE DATE(Start_Time) BETWEEN '2021-05-21' AND '2021-05-23' AND
                COUNTRY IN ('AU','NZ','UK')
          GROUP BY date, country, type, email
         ) ed
    GROUP BY country, type