Search code examples
sqlpostgresqlpostgresql-9.3

How to make query that will return differences between two tables in PostgreSQL


I need to return differences between two tables.

CREATE TEMPORARY TABLES

CREATE TEMP TABLE first(
  zoom smallint NOT NULL,
  x integer NOT NULL,
  y integer NOT NULL
);

CREATE TEMP TABLE second(
  zoom smallint NOT NULL,
  x integer NOT NULL,
  y integer NOT NULL
);

INSERT DATA

INSERT INTO first(zoom,x,y) VALUES(5,2,25);
INSERT INTO first(zoom,x,y) VALUES(5,4,45);
INSERT INTO first(zoom,x,y) VALUES(5,7,34);
INSERT INTO first(zoom,x,y) VALUES(5,45,40);
INSERT INTO first(zoom,x,y) VALUES(5,72,63);
INSERT INTO second(zoom,x,y) VALUES(5,2,25);
INSERT INTO second(zoom,x,y) VALUES(5,4,45);
INSERT INTO second(zoom,x,y) VALUES(5,7,34);

Wanted result:

In table first there are extra rows:
5,45,40
5,72,63

Edit

Sorry about this, but I have figured out now that my original data is pretty much complexed then the sample I provided. So in the original data, table first consists of 900 rows, and table second of 935 rows. I assumed that the rows are distinct in each table, however as I am not sure now, so I would like to include this condition in a query. I assumed query would return 35 rows as difference, because I was pretty much convinced that all zoom/x/y would be same except this 35 one. However, it might now be the case. So basically what I need to know is what are differences between two tables, whatever approach is the best to solve it.

Can I get something like this:

 zoom | x  | y  | first |second
------+----+--- +-------+------
   5  | 45 | 40 |  yes  |  no |

order by first yes, second no

 zoom | x  | y  | first |second
------+----+--- +-------+------
   5  | 45 | 40 |  yes  |  no |
   5  | 45 | 40 |  yes  |  no |
   5  | 45 | 40 |  yes  |  no |

then first no, second yes

 zoom | x  | y  | first |second
------+----+--- +-------+------
   5  | 45 | 40 |  no   |  yes |
   5  | 45 | 40 |  no   |  yes |
   5  | 45 | 40 |  no   |  yes |

Solution

  • As you want to compare all columns of both tables, you can use a full outer join on all columns and the check if one of them is:

    select case 
              when f.zoom is null then 'missing in first'
              when s.zoom is null then 'missing in second'
           end as status, 
           zoom, x, y
    from "first" f
      full outer join second s using (zoom, x, y)
    where f.zoom is null or s.zoom is null;
    

    The join based on using() will return those columns that are not null (and only those columns - removing the duplicate columns from the result)

    When using your sample data from the question, the result would be:

    status            | zoom | x  | y 
    ------------------+------+----+---
    missing in second |    5 | 45 | 40
    missing in second |    5 | 72 | 63
    

    If a row in the second table is added that doesn't exist in the first, e.g.:

     INSERT INTO second(zoom,x,y) VALUES(15,7,34);
    

    then the result will be:

    status            | zoom | x  | y 
    ------------------+------+----+---
    missing in second |    5 | 45 | 40
    missing in second |    5 | 72 | 63
    missing in first  |   15 |  7 | 34