Search code examples
sqlpostgresqlinner-join

PostgreSQL find missing attributes for each entity


I have two datasets attributes_type and entities.

attributes_type dataset contains multiple attributes_type_key's to each entity_type.

CREATE TABLE attributes_type (
    entity_type_key int,
  attributes_type_key serial PRIMARY KEY,
  name varchar,
  mandatory boolean
);
    
INSERT INTO attributes_type (entity_type_key, name, mandatory) VALUES
  (1, 'car_Name',  true),
  (1, 'Price',  true),
  (1, 'Location',  false),
  (2, 'Color_name',  true),
  (2, 'Color_code', false);

CREATE TABLE entities (
  entity_key int,
  entity_type_key int,
  attributes_type_key int,
  values varchar
);

INSERT INTO entities (entity_key, entity_type_key, attributes_type_key, values) VALUES
  (1, 1, 1, 'Ford'),
  (1, 1, 2, '25000'),
  (2, 1, 2, '30000'),
  (2, 1, 3, 'Los Angeles'),
  (3, 2, 4, 'Blue'),
  (3, 2, 5, 'xy1500'),
  (4, 1, 1, 'Toyota'),
  (4, 1, 3, 'New York'),
  (5, 2, 5, 'xc000'),
  (5, 2, 4, 'Nilkamal'),
  (7, 1, 3, 'london');

DB Fiddle

if attributes_type_key's has mandatory true then entity_key should have those mandatory attributes_type_key in entities dataset for that entity_type.

Example: 1 entity_type_key has name and price attribute_type mandatory=true. so if entities dataset has entity_key of entity_type_key=1 then attributes_type_key 1 and 2 must be there.

Now I want query entities table and find entity_key that doesn't have mandatory attributes_type_key . Example: entity_key 2 doesn't mandatory attributes_type_key 1 , similarly entity_key 4 and 7 have missing mandatory attributes_type_key.

query should return 2,4 and 7 entity_key

Could you help with query


Solution

  • As a starter: your data model should have a separate table to store the entities, along with their type. Your current schema redondently stores the entity type on each row of the mapping table (which should actually be called something like entites_attributes rather than just entities).

    One option is to extract distinct entities from the , and then bring the corresponding mandatory attributes with a join; we can then filter on the tuples that do not exist in the mapping table with not exists:

    select e.entity_key, a.*
    from (select distinct entity_key, entity_type_key from entities) e
    inner join attributes_type a on a.entity_type_key = e.entity_type_key
    where a.mandatory and not exists (
        select 1
        from entities ea
        where ea.entity_key = e.entity_key and ea.attributes_type_key = a.attributes_type_key
    )
    

    This generates one row for each missing entity/attribute tuple, so for your sample data:

    entity_key entity_type_key attributes_type_key name mandatory
    2 1 1 car_Name t
    4 1 2 Price t
    7 1 1 car_Name t
    7 1 2 Price t

    If you just want the entities, you can group by instead.

    fiddle