database database-design entity relational-database rdbms

Advantages of transforming IS A relationships

I would like to know what are the advantages/performance impacts when we transform a IS_A hierarchy to relations. Is it better to transform to keep the 3 tables (or) using seperate tables for Faculty and student?And also if (X,Y) is a key of a relation.Can either of them be a super key of relation ?

Person(Pid,name,age) Faculty(Pid,rank) Student(Pid,gpa)

Solution

Many times through the years it has occurred to me that best designs for prevailing theory and best designs for practical application are moving further and further apart. The design you show is a poor one in that it is susceptible to data anomalies. Such as: nothing prevents the PID of a faculty member from being entered into the Student table and vice versa.

There must be a way to specify that a PID is that of faculty or student (or both if that is allowed). The the Faculty and Student tables must be designed to adhere to that specification.

Fortunately, that is not difficult. Call it a hybrid intersection or cross table that comes between the main entity and the derived entities. This not only connects the derived entity to the main entity, but also defines the type of derivation. Here is a minimal definition:

create table FacultyOrStudent(
  PID        int  not null references Person( PID ),
  PersonType char( 1 ) check( PersonType in ( 'F', 'S' )),
  constraint PK_FacultyOrStudent primary key( PID, PersontType )
);

There could well be other fields like a date that the person joined the faculty or student body.

The PK allows the same person to be both a faculty member and student. If that is not allowed, the PK would be the PID field alone. However, in that case, (PID, PersonType) would be defined as unique. I'll elaborate below.

Unlike a standard intersection table, the only foreign key is the PID back to the Person table or the main entity. It cannot also be a FK to the derived entity as that is defined in different tables. However, nothing prevents us from having it the target of a FK reference from those other tables. Thus the defining of (PID, PersonType) as either the PK or as unique.

Here, then, are the derived entities:

create table FacultyPerson(
  PID          int not null primary key,
  FacultyType  char( 1 ) check( FacultyType = 'F' ),
  Rank         ranktype,
  constraint FK_FacultyToDefinition foreign key( PID, FacultyType )
    references FacultyOrStudent( PID, PersonType )
);

create table StudentPerson(
  PID          int not null primary key,
  StudentType  char( 1 ) check( StudentType = 'S' ),
  GPA          gpatype,
  constraint FK_StudentToDefinition foreign key( PID, StudentType )
    references FacultyOrStudent( PID, PersonType )
);

The same PID cannot be used more than once as either a faculty member or student. Most importantly, it is not possible to add the PID to the FacultyPerson or StudentPerson tables that is not previously defined as a faculty member or student, respectively, in the FacultyOrStudent table.

To make the work of the application developers easier (and because my personal rule is not to all apps direct access to tables), create two views which provides all the faculty data and student data.

create view Faculty as
  select  f.PID, p.name, p.age, fp.Rank
  from    FacultyPerson fp
  join    FacultyOrStudent fos
      on  fos.PID = fp.PID
      and fos.PersonType = fp.FacultyType
  join    Person p
      on  p.PID = fos.PID;

create view Students as
  select  sp.PID, p.name, p.age, sp.GPA
  from    StudentPerson sp
  join    FacultyOrStudent fos
      on  fos.PID = sp.PID
      and fos.PersonType = sp.StudentType
  join    Person p
      on  p.PID = fos.PID;

This allows the apps to access the data in the form they most need. Triggers on the views also allow all DML operations in that same form. The apps don't need to know what actual form the underlying data is in. This gives the database developers the added convenience of being free to change the underlying data without worrying about the impact on the apps. Just change the views appropriately.

The names of the objects I use are for illustration only. Naming is per personal preference and/or corporate rules.

I also hardcoded the 'F' and 'S' values. Again, for illustration. These would much better be placed in their own lookup table with the field in FacultyOrStudent as a FK. This allows for scalability. To add other types of staff, Secretarial (S) or Custodial (C) or Maintenance (M) or whatever, just add the definition(s) to the lookup table and create the needed table(s) and view(s).

In short, do not transform. Keep the tables and add whatever other tables may be needed to maintain strict data integrity. Data integrity is your top priority in database design.