Tags: sql, database, foreign-keys

Why are foreign keys more used in theory than in practice?


When you study relational theory, foreign keys are mandatory. But in practice, everywhere I have worked, table products and joins are always done by specifying the keys explicitly in the query, instead of relying on foreign keys declared in the DBMS.

This way, you could join two tables on fields that are not meant to be foreign keys and get unexpected results.

Why is that?

Shouldn't DBMSs enforce that joins and products be made only on foreign keys?

The main reason for FKs is referential integrity. But if you design a DB, relationships in the model (arrows in the ERD) become foreign keys. (N:M relationships become separate tables.) Whether or not you define them as such in your DBMS, they're semantically FKs.

I can't imagine the need to join tables by fields that aren't FKs; what is an example?


Solution

  • The reason foreign key constraints exist is to guarantee that the referenced rows exist.

    "The foreign key identifies a column or a set of columns in one table that refers to a column or set of columns in another table. The values in one row of the referencing columns must occur in a single row in the referenced table.

    Thus, a row in the referencing table cannot contain values that don't exist in the referenced table (except potentially NULL). This way references can be made to link information together and it is an essential part of database normalization." (Wikipedia)
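
    As a minimal sketch of what that guarantee means in practice, the hypothetical `countries` and `customers` tables below (they are not part of the question) declare a foreign key. Assuming the DBMS enforces the constraint, the last insert should be rejected:

    ```sql
    -- Hypothetical tables, for illustration only.
    CREATE TABLE countries (
        country_code  char(2)       NOT NULL PRIMARY KEY,
        country_name  varchar(100)  NOT NULL
    );

    CREATE TABLE customers (
        customer_id    int           NOT NULL PRIMARY KEY,
        customer_name  varchar(250)  NOT NULL,
        country_code   char(2)       NULL,
        CONSTRAINT fk_customers_country
            FOREIGN KEY (country_code) REFERENCES countries (country_code)
    );

    INSERT INTO countries (country_code, country_name) VALUES ('PT', 'Portugal');

    -- Accepted: 'PT' exists in countries.
    INSERT INTO customers (customer_id, customer_name, country_code) VALUES (1, 'Ana', 'PT');

    -- Accepted: a NULL reference is not checked against the referenced table.
    INSERT INTO customers (customer_id, customer_name, country_code) VALUES (2, 'Bob', NULL);

    -- Rejected with a foreign key violation: 'ZZ' does not exist in countries.
    INSERT INTO customers (customer_id, customer_name, country_code) VALUES (3, 'Eve', 'ZZ');
    ```

    Some systems need enforcement switched on explicitly (SQLite, for example, requires `PRAGMA foreign_keys = ON;`), so the rejection depends on that setting.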


    RE: Your question: "I can't imagine the need to join tables by fields that aren't FKs":

    When defining a foreign key constraint, the column(s) in the referencing table must refer to the primary key of the referenced table, or at least to a candidate key.
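
    To illustrate the candidate key case, here is a small sketch with hypothetical `employees` and `door_access_log` tables (names invented for this example). The foreign key targets a column that is not the primary key but is declared `UNIQUE`:

    ```sql
    -- Hypothetical tables, for illustration only.
    CREATE TABLE employees (
        employee_id  int      NOT NULL PRIMARY KEY,
        badge_no     char(8)  NOT NULL,
        CONSTRAINT uq_employees_badge UNIQUE (badge_no)  -- candidate key
    );

    CREATE TABLE door_access_log (
        log_id    int      NOT NULL PRIMARY KEY,
        badge_no  char(8)  NOT NULL,
        -- Allowed: the referenced column is not the primary key, but it is unique.
        CONSTRAINT fk_log_badge
            FOREIGN KEY (badge_no) REFERENCES employees (badge_no)
    );
    ```

    Without the `UNIQUE` constraint on `badge_no`, SQL Server and PostgreSQL would refuse to create `fk_log_badge`, because a referencing row could then match more than one referenced row.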

    A join has no such restriction: the join condition can use any columns, not only primary keys or candidate keys.

    The following is an example that could make sense:

    CREATE TABLE clients (
        client_id       uniqueidentifier  NOT NULL,
        client_name     nvarchar(250)     NOT NULL,
        client_country  char(2)           NOT NULL
    );
    
    CREATE TABLE suppliers (
        supplier_id       uniqueidentifier  NOT NULL,
        supplier_name     nvarchar(250)     NOT NULL,
        supplier_country  char(2)           NOT NULL
    );
    

    And then query as follows:

    SELECT 
        client_name, supplier_name, client_country 
    FROM 
        clients 
    INNER JOIN
        suppliers ON (clients.client_country = suppliers.supplier_country)
    ORDER BY
        client_country;
    

    Another case where these joins make sense is in databases that offer geospatial features, like SQL Server 2008 or Postgres with PostGIS. You will be able to do queries like these:

    SELECT
        state, electorate 
    FROM 
        electorates 
    INNER JOIN 
        postcodes ON (postcodes.Location.STIntersects(electorates.Location) = 1);
    

    Source: ConceptDev - SQL Server 2008 Geography: STIntersects, STArea

    You can see another similar geospatial example in the accepted answer to the post "Sql 2008 query problem - which LatLong’s exists in a geography polygon?":

    SELECT 
        G.Name, COUNT(CL.Id)
    FROM
        GeoShapes G
    INNER JOIN 
        CrimeLocations CL ON G.ShapeFile.STIntersects(CL.LatLong) = 1
    GROUP BY 
        G.Name;
    

    These are all valid SQL joins that have nothing to do with foreign keys or candidate keys, and they can still be useful in practice.
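
    A range (non-equi) join is one more case of the same kind. The sketch below uses hypothetical `orders` and `discount_bands` tables, invented for this example, where the join condition is a `BETWEEN` test rather than a key match:

    ```sql
    -- Hypothetical tables, for illustration only.
    CREATE TABLE orders (
        order_id     int            NOT NULL PRIMARY KEY,
        order_total  decimal(10,2)  NOT NULL
    );

    CREATE TABLE discount_bands (
        band_name  varchar(20)    NOT NULL PRIMARY KEY,
        min_total  decimal(10,2)  NOT NULL,
        max_total  decimal(10,2)  NOT NULL
    );

    -- Each order is paired with every band whose range contains its total.
    -- No column of orders references a key of discount_bands.
    SELECT
        o.order_id, o.order_total, b.band_name
    FROM
        orders o
    INNER JOIN
        discount_bands b ON o.order_total BETWEEN b.min_total AND b.max_total
    ORDER BY
        o.order_id;
    ```

    No foreign key could express this relationship, since `order_total` only has to fall inside a range and never has to equal a value stored in `discount_bands`.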