Search code examples
mysqlsqlnatural-key

Can I use a non-numerical primary key for a MySQL table?


In my web application, the user can define documents and give them a unique name that identifies that document and a friendly name that a human will use to refer to the document. Take the following table schema as an example:

|   id   |    name        |   friendly_name   |
-----------------------------------------------
|    2   |    invoice-2   |   Invoice 2       |

In this example I've used the id column as the primary key, which is an auto incrementing number. Since there's already a natural ID for documents (name) I could also do this:

|    name        |   friendly_name   |
--------------------------------------
|    invoice-2   |   Invoice 2       |

In this example, name is the primary key of the document. We've eliminated the id field as it's essentially just a duplicate of name, since every document in the table must have a unique name anyway.

This would also mean that when I refer to a document from a foreign key relationship I'd have to call it document_name rather than document_id.

What's the best practice regarding this? Theoretically it's entirely possible for me to use a VARCHAR for the primary key, but does it come with any downsides such as performance overhead?


Solution

  • There are two schools of thought on this topic.

    There are some who hold strongly to the belief that using a "natural key" as the primary key for an entity table is desirable, because it has significant advantages over a surrogate key.

    The are others that believe that a "surrogate" key can provide some desirable properties which a "natural" key may not.

    Let's summarize some of the most important and desirable properties of a primary key:

    • minimal - fewest possible number of attributes
    • simple - native datatypes, ideally a single column
    • available - the value will always be available when the entity is created
    • unique - absolutely no duplicates, no two rows will ever have the same value
    • anonymous - carries no hidden "meaningful" information
    • immutable - once assigned, it will never be modified

    (There are some other properties that can be listed, but some of those properties can be derived from the properties above (not null, can be indexed, etc.)


    I break the two schools of thought regarding "natural" and "surrogate" keys as the "best" primary keys into two camps:

    1) Those who have been badly burned by an earlier decision to elect a natural key as the primary key, and

    2) Those who have not yet been burned by that decision.