Search code examples
database-designnormalizationjunction-table

Junction Table & Normalization Question


I am having a hard time trying to figure out if the following design pattern is acceptable. I have the following requirements (and some other more) for a relational model:

1) It must be able to represent applications (such as AppA, AppB, AppC), each one with it's own set of attributes.

2) Every applications can communicate through different channels like Internet (E-Mail, Twitter, Facebook), Phone (SMS, MMS, etc.) so that there's a many-to-many relationship between programs and channels.

3) There is a set of pre-defined identifiers (addresses, phone-numbers, login accounts) which can be shared by many programs, so that, again, there's a many-to-many relationship between programs and identifiers.

4) The same identifier can send several types of messages, and so can the programs (again, many-to-many), but I need to be able to restrict usage of identifiers for communications type on a per-application basis.

Basically, what I did was to create four tables, Program, Channel, Ident and CommunicationType to store information about each of these and, instead of creating junction tables for (Program, Channel), (Program, Identifier), and so on which would just complicate the design, I created a single table consisting of the primary keys of these four tables with a unique constraint on (Program, Channel, Ident, CommunicationType). Now, each record of this table is linked to a given communication.

Of course, this solves my problem in a pretty easy way, but now I am questioning myself whether this is acceptable at all of if it defeats the principles of normalization. Can anyone please give me an opinion?


Solution

  • Basically, what I did was to create four tables, Program, Channel, Ident and CommunicationType to store information about each of these and,

    That's a fine idea.

    instead of creating junction tables for (Program, Channel), (Program, Identifier), and so on which would just complicate the design, I created a single table consisting of the primary keys of these four tables with a unique constraint on (Program, Channel, Ident, CommunicationType).

    You need to be careful of one thing when you design tables like this. Your structure, which has the key {Program, Channel, Ident, CommunicationType}, allows every possible combination of Program and Channel, of Channel and Ident, of Program and CommunicationType, and so on. Sometimes that's a bad idea.

    The same identifier can send several types of messages, and so can the programs (again, many-to-many), but I need to be able to restrict usage of identifiers for communications type on a per-application basis.

    And that's what makes it a bad idea. You seem to be saying that not every combination of Ident, Program, and CommunicationsType is valid.

    Store valid combinations in their own tables. Use foreign key references to maintain data integrity.

    Build a table that has the key {Program, Ident, CommunicationsType}. The table that has the key {Program, Channel, Ident, CommunicationType} can set a foreign key reference to it.

    Build as many tables as it takes to implement all the constraints you know of. More tables means data integrity checks are simpler. (You might need more tables than the ones I mentioned. Don't assume they need to have two columns; they might need more.)

    It's not at all clear that you need a table keyed {Program, Channel}. But if you do, then you need to build tables something along these lines. (Air code.)

    create table pc (
        program_name varchar(10) not null references programs (program_name),
        channel_name varchar(10) not null references channels (channel_name),
        primary key (program_name, channel_name)
    );
    
    create table pict (
        program_name varchar(10) not null,
        channel_name varchar(10) not null,
        comm_type varchar(10) not null references communication_type (comm_type),
        primary key (program_name, channel_name, comm_type),
        foreign key (program_name, channel_name) 
            references pc (program_name, channel_name) 
    );
    
    create table your-table-name (
        program_name varchar(10) not null,
        channel_name varchar(10) not null,
        comm_type varchar(10) not null,
        ident varchar(10) not null,
        primary key (program_name, channel_name, comm_type, ident),
        foreign key (program_name, channel_name, comm_type) 
            references pict (program_name, channel_name, comm_type),
        foreign key (ident) references ident (ident)
    );
    

    Add other columns as needed. In some cases, you'll probably find that you need overlapping foreign keys. I don't think you need them here, but I could be wrong.

    I'm not sure what you mean by "if it defeats the principles of normalization". A table that has a four-column primary key doesn't violate any of the normal forms for that reason alone, although it might for other reasons. Failing to implement all the known constraints is generally, um, sub-optimal design, but not because it violates any of the normal forms.