Search code examples
databaseprimary-keyrdbmscandidate-key

RDBMS keys confusion


I have some confuse about candidate key, primary key, mininal superkey.

By definition:

Primary key: only 1 attribute , so it must be minimal superkey and it's also a candidate key(because its minial superkey)

First conclusion: If a key is primary key, then it also a candidate key and a minimal superkey

The following point I think it's true, but I am not sure. Can someone confirm with me together?

  1. If it's minimal superkey, it doesn't mean it's a primary key. But it means it's candidate key.

  2. If it's candidate key, it doesn't neccessarily to be mininal key, also it doesn't need to be primary key.

Conclusion:

  1. primary key: only 1 value, can identify whole row. It also candidate key and minimal

  2. minimal superkey: 1 value or combination of fields can identify whole rows, so it's a candidate key but not necessary primary. But if takes out any one field, then it's no longer a key

  3. Candidate key: 1 value or combination of fields can identify whole rows but don't have to be minimal or primary.


Solution

  • The short answer is much simpler than you describe. A key (also variously known as a candidate key or primary key or secondary key or alternate key) is a minimal superkey. In other words a key is a set of attributes which are guaranteed to be irreducibly unique and never null. Keys may consist of one or more attributes and sometimes zero attributes (zero attribute keys are relatively unusual and not often discussed but are still useful as keys).

    A longer explanation:

    By convention and for convenience one key per relation is usually designated to be a "primary" key, meaning it is deemed to have special significance in some way. The special significance could be anything - typically it is either a key that is referenced as a foreign key in other relations or the key which is the "preferred" identifier for that relation.

    It is important to understand however that a "primary" key is fundamentally no different to any other key unless you choose to make it so (or are forced to by the limitations of your DBMS software). In principle therefore there is no absolute reason why you must always designate one "primary" key. Assuming every relation has at least one candidate key (which by definition it must) then you might just as well call zero, one or more of those keys "primary". It is however a very strong convention to designate exactly one and only one primary key and that is the convention supported by the syntax of SQL-based DBMSs.

    As a matter of historical detail it should be noted that E.F.Codd, inventor of the Relational Model, originally used the term Primary Key to mean any and all the keys of a relation and not just one. In modern usage however, the term candidate key is used for what were originally called the primary keys, while the designation of "primary" is used in the way I've described above. Unfortunately there exists a great deal of confusion on this point, and you will quite often see claims that "primary" keys are somehow different from other candidate keys in some arbitrary way.

    Here is Hugh Darwen on the subject of keys:

    a relvar can have several keys, but we choose just one for underlining and call that one the primary key. The choice is arbitrary, so the concept of primary is not really very important from a logical point of view. The general concept of key, however, is very important! The term candidate key means exactly the same as key (i.e., the addition of candidate has no real significance—it was proposed by Ted Codd because he regarded each key as a candidate for being nominated as the primary key).

    http://www.dcs.warwick.ac.uk/~hugh/M359/What-Is-a-Key.html