
How to calculate percentages in MySQL based on several conditions across different tables


I have two tables, sales and location, related by sales.id_location = location.id_location (this is dummy data, not real data). In the sales table, id_order identifies each transaction, createdAt is the date the transaction happened, sale is the transaction amount (kg), id_location is the shipping location (it references id_location in the location table), and createdby is the buyer.

CREATE TABLE sales
(
    id_order VARCHAR(50) NOT NULL,
    createdAt datetime NOT NULL,
    sale DECIMAL(14,2) NOT NULL,
    id_location varchar(50) NOT NULL,
    createdby varchar(50) NOT NULL,
    PRIMARY KEY(id_order,createdAt)
);

INSERT INTO sales (id_order, createdAt, sale, id_location, createdby)
VALUES(1,'2016-02-02',100, 1, 123),
      (2,'2017-03-02',150, 2, 233),
      (3,'2018-02-02',200, 3, 234),
      (4,'2016-03-03',150, 1, 123),
      (5,'2017-03-04',100, 2, 2334),
      (6,'2018-03-05',200,3, 234),
       (7,'2016-03-10',200, 1, 233),
      (8,'2017-02-01',150, 2, 124),
      (9,'2018-02-04',250, 3, 233),
      (10,'2018-02-05',300, 2, 124);

CREATE TABLE location
(
     id_location varchar(50) NOT NULL,
     location_city varchar(50) NOT NULL
);

INSERT INTO location(id_location, location_city)
VALUES (1, 'Jakarta'),
 (2, 'Depok'),
 (3, 'Bekasi');

select * from sales;
select * from location;

This is the fiddle https://dbfiddle.uk/?rdbms=mysql_5.7&fiddle=eac3dc2845bfa425fbd576cc18c72609

I am using MySQL 5.7. I want to compute sales statistics for each location under these conditions:

  1. the sales fall between '2016-02-01' and '2018-03-10';

  2. the buyer (column createdby) made a transaction before '2018-03-10' and made at least one more transaction between '2016-02-01' and '2018-03-10'.

So a buyer who made only one transaction, or who made several transactions but none between '2016-02-01' and '2018-03-10', is not counted and not included.
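
Condition 2 can be checked on its own before any location statistics are involved. The following is a minimal sketch, assuming the end date '2018-03-10' stated above (the queries further down use '2018-03-04' instead); the aliases orders_up_to_end and orders_in_range are made up for illustration.

```sql
-- Sketch: per-buyer counts used to decide who satisfies condition 2.
-- In MySQL a comparison evaluates to 1 or 0, so SUM(...) counts matching rows.
SELECT createdby,
       SUM(createdAt <= '2018-03-10') AS orders_up_to_end,
       SUM(createdAt BETWEEN '2016-02-01' AND '2018-03-10') AS orders_in_range
FROM sales
GROUP BY createdby
HAVING orders_up_to_end > 1     -- more than one transaction overall
   AND orders_in_range > 0;     -- at least one of them inside the range
```

On the dummy data this should keep buyers 123, 124, 233 and 234 and drop 2334, who has only a single transaction.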

Given these conditions and the dummy data, the expected result looks like this:

+----------+----------+---------+----------------+--------------------+
| Location | sale(kg) | sale(%) | count id_order | count id_order (%) |
+----------+----------+---------+----------------+--------------------+
| Jakarta  |      450 |   26,48 |              3 |              33,33 |
| Depok    |      600 |   35,30 |              3 |              33,33 |
| Bekasi   |      650 |   38,22 |              3 |              33,33 |
| Total    |     1700 |     100 |              9 |                100 |
+----------+----------+---------+----------------+--------------------+

This is my SQL statement:

SELECT 
  IFNULL(location.location_city, 'Total') AS `Location`, 
  SUM(sale) AS `sale(kg)`,
  SUM(sale) / (SELECT SUM(sale) FROM sales) * 100 AS `sale (%)`, 
  COUNT(id_order) AS `count(id_order)`,
  COUNT(id_order) / (SELECT COUNT(id_order) FROM sales) * 100 AS `count(id_order) (%)`
FROM sales, location
where sales.id_location = location.id_location
and createdAt <= '2018-03-04'
and EXISTS (select 1 from sales s2, location l2 where
sales.id_location = s2.id_location
and sales.id_location = l2.id_location and
createdAt >= '2016-02-01'
and createdAt <= '2018-03-04')
GROUP BY location WITH ROLLUP
having count(createdby) > 1;


Solution

  • Try this:

    SELECT COALESCE(location_city, 'Total') AS `Location`, 
           SUM(sale) AS `sale(kg)`,
           SUM(sale) / ANY_VALUE(totalsum) * 100 AS `sale (%)`, 
           COUNT(id_order) AS `count(id_order)`,
           COUNT(id_order) / ANY_VALUE(totalcount) * 100 AS `count(id_order) (%)`
    FROM sales
    NATURAL JOIN location
    NATURAL JOIN ( SELECT s1.createdby
                   FROM sales s1
                   GROUP BY s1.createdby
                   HAVING SUM(s1.createdAt BETWEEN '2016-02-01' AND '2018-03-04')
                      AND SUM(s1.createdAt <= '2018-03-04') > 1 ) clients
    JOIN ( SELECT SUM(sale) totalsum, 
                  COUNT(id_order) totalcount 
           FROM sales ) totals
    GROUP BY location_city WITH ROLLUP
    

    fiddle (see comments in the fiddle).
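
    The HAVING clause in the clients subquery relies on MySQL evaluating a comparison to 1 or 0, so SUM(condition) counts the rows for which the condition holds. A tiny illustration on an inline derived table (the table t and column x are made up for the example):

    ```sql
    -- x takes the values 1, 2, 3; the comparison x > 1 yields 0, 1, 1.
    SELECT SUM(x > 1) AS rows_matching   -- returns 2
    FROM ( SELECT 1 AS x
           UNION ALL SELECT 2
           UNION ALL SELECT 3 ) t;
    ```

    In the query above, SUM(s1.createdAt BETWEEN ...) is therefore non-zero when the buyer has at least one order in the range, and SUM(s1.createdAt <= ...) > 1 requires more than one order up to the end date.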


    The percentage totals for sale and count id_order should be 100, because they are computed over the statistics for the date range, not over all of the dummy data. – Fachry Dzaky

    If so, these totals must be calculated separately, over the same filtered set of buyers. Try this:

    SELECT COALESCE(location_city, 'Total') AS `Location`, 
           SUM(sale) AS `sale(kg)`,
           SUM(sale) / ANY_VALUE(totalsum) * 100 AS `sale (%)`, 
           COUNT(id_order) AS `count(id_order)`,
           COUNT(id_order) / ANY_VALUE(totalcount) * 100 AS `count(id_order) (%)`
    FROM sales
    NATURAL JOIN location
    NATURAL JOIN ( SELECT s1.createdby
                   FROM sales s1
                   GROUP BY s1.createdby
                   HAVING SUM(s1.createdAt BETWEEN '2016-02-01' AND '2018-03-04')
                      AND SUM(s1.createdAt <= '2018-03-04') > 1 ) clients
    JOIN ( SELECT SUM(sale) totalsum, 
                  COUNT(id_order) totalcount 
           FROM sales
           NATURAL JOIN ( SELECT s1.createdby
                          FROM sales s1
                          GROUP BY s1.createdby
                          HAVING SUM(s1.createdAt BETWEEN '2016-02-01' AND '2018-03-04')
                             AND SUM(s1.createdAt <= '2018-03-04') > 1 ) clients ) totals
    GROUP BY location_city WITH ROLLUP
    

    fiddle
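
    For completeness, here is a variant sketch that also applies condition 1 from the question (only sales inside the date range are summed) and uses the end date '2018-03-10' stated there rather than '2018-03-04'. The aliases s, l, c, c2, s2 and t, the explicit ON clauses and the ROUND calls are additions for this sketch, not part of the queries above, and it has not been run against the fiddle.

    ```sql
    SELECT COALESCE(l.location_city, 'Total') AS `Location`,
           SUM(s.sale) AS `sale(kg)`,
           ROUND(SUM(s.sale) / ANY_VALUE(t.totalsum) * 100, 2) AS `sale (%)`,
           COUNT(s.id_order) AS `count(id_order)`,
           ROUND(COUNT(s.id_order) / ANY_VALUE(t.totalcount) * 100, 2) AS `count(id_order) (%)`
    FROM sales s
    JOIN location l ON l.id_location = s.id_location
    -- Condition 2: buyers with more than one order up to the end date,
    -- at least one of them on or after the start date.
    JOIN ( SELECT createdby
           FROM sales
           WHERE createdAt <= '2018-03-10'
           GROUP BY createdby
           HAVING COUNT(*) > 1
              AND SUM(createdAt >= '2016-02-01') > 0 ) c
      ON c.createdby = s.createdby
    -- Totals over the same filtered rows, so the Total row comes out at 100.
    CROSS JOIN ( SELECT SUM(s2.sale) AS totalsum,
                        COUNT(s2.id_order) AS totalcount
                 FROM sales s2
                 JOIN ( SELECT createdby
                        FROM sales
                        WHERE createdAt <= '2018-03-10'
                        GROUP BY createdby
                        HAVING COUNT(*) > 1
                           AND SUM(createdAt >= '2016-02-01') > 0 ) c2
                   ON c2.createdby = s2.createdby
                 WHERE s2.createdAt BETWEEN '2016-02-01' AND '2018-03-10' ) t
    -- Condition 1: only sales inside the date range.
    WHERE s.createdAt BETWEEN '2016-02-01' AND '2018-03-10'
    GROUP BY l.location_city WITH ROLLUP;
    ```

    On the dummy data this should reproduce the sale(kg) and count columns of the expected table (450, 600 and 650, with 1700 in total; three orders per city, nine in total). Because createdAt is a DATETIME, '2018-03-10' is compared as midnight at the start of that day, so a later transaction on the same day would need a wider upper bound.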