Search code examples
mysqlsqlgroup-bysum

How to sum value by DATE_FORMAT(date,'%Y-%m-%d') limit 20, and sum remaining value if greater than 20?


How to sum value by DATE_FORMAT(date,'%Y-%m-%d') and id the first 20 rows of data, and sum the remaining value if greater than 20 else 0? supposed I have below data with below SQL, how to do it? thanks so much for any advice.

SELECT SUM(value), id, DATE_FORMAT(date,'%Y-%m-%d') 
FROM test_table
GROUP BY id, DATE_FORMAT(date,'%Y-%m-%d')
sum(value)  id  DATE_FORMAT(date,'%Y-%m-%d')
-------------------------------------------- 
64.98   123456  2021-01-01
64.98   123456  2021-01-02
64.98   123456  2021-01-03
64.98   123456  2021-01-04
64.98   123456  2021-01-05
64.98   123456  2021-01-06
72.34   123456  2021-01-07
64.98   123456  2021-01-08
64.98   123456  2021-01-09
103.80  123456  2021-01-10
64.98   123456  2021-01-11
64.98   123456  2021-01-12
64.98   123456  2021-01-13
64.98   123456  2021-01-14
64.98   123456  2021-01-15
64.98   123456  2021-01-16
64.98   123456  2021-01-17
64.98   123456  2021-01-18
64.98   123456  2021-01-19
64.98   123456  2021-01-20
64.98   123456  2021-01-21
64.98   123456  2021-01-22
64.98   123456  2021-01-23
64.98   123456  2021-01-24
144.98  123456  2021-01-25
64.98   123456  2021-01-26
64.98   123456  2021-01-27
64.98   123456  2021-01-28
64.98   123456  2021-01-29
64.98   123456  2021-01-30
64.98   123456  2021-01-31
64.98   123456  2021-02-01
64.98   123456  2021-02-02
64.98   123456  2021-02-03
64.98   123456  2021-02-04
720.92  123456  2021-02-05
66.98   123456  2021-02-06
66.98   123456  2021-02-07
66.98   123456  2021-02-08
64.98   123456  2021-02-09
64.98   123456  2021-02-10
64.98   123456  2021-02-11
64.98   123456  2021-02-12
64.98   123456  2021-02-13
64.98   123456  2021-02-14
64.98   123456  2021-02-15
64.98   123456  2021-02-16
64.98   123456  2021-02-17
64.98   123456  2021-02-18
64.98   123456  2021-02-19
64.98   123456  2021-02-20
64.98   223456  2021-01-20
54.98   223456  2021-01-21

...................................


Solution

  • If you're using MySQL version that supports window function then probably:

    SELECT id,
           SUM(CASE WHEN rn <= 20 THEN vals ELSE 0 END) AS 'SumOf1st20',
           SUM(CASE WHEN rn > 20 THEN vals ELSE 0 END) AS 'SumOFOthers'
    FROM
    (SELECT SUM(value) AS vals, id, DATE_FORMAT(date,'%Y-%m-%d') AS dt,
           ROW_NUMBER() OVER (PARTITION BY id ORDER BY dt DESC) AS rn
    FROM test_table
    GROUP BY id, dt) v
    GROUP BY id;
    

    But I wonder if this is exactly what you want since your sample data seems to be a result of your original query.

    Here's a sample fiddle for reference.

    This here is a way to replicate ROW_NUMBER() function in older MySQL version:

    SELECT t1.*,
           CASE WHEN id=@idx THEN @rnk := @rnk+1 
            ELSE @rnk := 1 END AS rn,
            @idx := id
     FROM (SELECT id, 
              SUM(value) AS vals, DATE_FORMAT(date,"%Y-%m-%d") AS dt 
       FROM test_table 
       GROUP BY id, dt)t1
     CROSS JOIN (SELECT @rnk := 0, @idx := NULL) r
     ORDER BY id, dt DESC
    

    Fiddle

    Although, (in rare occasions) I had experience where this doesn't perform as I expect it to be: the row numbering sometimes gets messed up.

    With further testing, that "rare occasion" seems to be happening here. Since your original query include aggregation, I took that and compare it against the v8 compatible ROW_NUMBER() and I got different results. Upon observation, the row numbers were assigned before the ORDER BY xxx DESC occur. So, visually it does sort by date descending accordingly but the rn column "incorrectly" generated. The workaround I found was to perform the aggregation in subquery first then only attempt to generate the custom row numbering. See my updated query and fiddle above.