Search code examples
mysqlmysql-5.7

mysql query to summarize data


I am trying to summarize some data from two data sources. The tables are following

Table t1

auto_id unique_column value
1 d1 10
2 d2 5
3 d3 15

Table t2

auto_id unique_column value
null d3 5
null d4 6

Expected data in t1 after query:

auto_id unique_column value
1 d1 10
2 d2 5
3 d3 20
4 d4 6

To solve this I was planning to do the following query

REPLACE INTO t1
SELECT auto_id, unique_column, SUM(value) value FROM 
(
  SELECT * FROM t1
  UNION ALL 
  SELECT * FROM t2
) GROUP BY unique_column; 

Issue with this query

SELECT auto_id, unique_column, SUM(value) value FROM 
(
  SELECT * FROM t1
  UNION ALL 
  SELECT * FROM t2
) GROUP BY unique_column; 

The above query will give us

auto_id unique_column value
1 d1 10
2 d2 5
3 d3 20
NULL d4 6

Which is okay. In the insert query we will get the following result:

auto_id unique_column value
1 d1 10
2 d2 5
3 d3 20
4 d4 6

But the following query

SELECT auto_id, unique_column, SUM(value) value FROM 
(
  SELECT * FROM t2
  UNION ALL 
  SELECT * FROM t1
) GROUP BY unique_column; 

will give us

auto_id unique_column value
1 d1 10
2 d2 5
NULL d3 20
NULL d4 6

So, Insert query will product the following unexpected result

auto_id unique_column value
1 d1 10
2 d2 5
3 d3 15
4 d3 20
5 d4 6

Alternative thought was, full outer join but mysql decided not to implement this. What other alternative can be done with mysql?


Solution

  • You were close:

    REPLACE INTO t1
    SELECT MAX(auto_id), unique_column, SUM(value) value FROM 
    (
      SELECT * FROM t1
      UNION ALL 
      SELECT * FROM t2
    ) t
    GROUP BY unique_column;
    

    MAX(auto_id) will skip NULL fields, which will give us just 3 for 'd3'; for the 'd4' case, auto_id will still be NULL, but (as long as the column is declared as AUTO_INCREMENT NOT NULL), NULL will be replaced by the next number in the sequence.

    EDIT for your edit: as long as you properly aggregate auto_id using MAX (or MIN), the only scenario NULL appears is if you have no prior auto_id; it is not sensitive to ordering.