Why do we use PRIMARY_KEY_AUTO_INCREMENT
in SQL and not just AUTO_INCREMENT
for an id in a table?
If we just set the data type of the id to be AUTO_INCREMENT
, won't that automatically increment it, therefore generating a unique id each time? Thus, what is the point of setting PRIMARY_KEY
(which uniquely identifies each item in the table) on top?
Because of constraints, because SQL is a declarative language, and because it is self-documenting.
AUTO_INCREMENT
and PRIMARY KEY
do two very different things.
AUTO_INCREMENT
is not standard SQL, but most databases have something like it. In usually makes the default an incrementing integer. That's the mechanistic element of a primary key: an auto-incrementing integer is a decent unique id.
PRIMARY KEY
does a number of things.
not null
.unique
.Most are constraints on the value. It must exist (1), it must be unique (2), and it must be the only primary key (5). Constraints guarantee the data is as you expect. Without them you might accidentally insert multiple rows with the same primary key, or with no primary key. Or the table might have multiple "primary" keys, which do you use? With constraints you don't have to check the data every time you fetch it, you know it will within its declared constraints.
Indexing the primary key (3) is an optimization. You're probably going to be searching by primary key a lot. Indexes aren't part of the SQL standard. That might be surprising. They're a detail of implementing SQL. Indexes don't affect the result of the query, and the SQL standard avoids telling databases how they should do things. This is part of why SQL is so ubiquitous, it is declarative.
In a declarative language you don't say how to do it, you say what you want. A SQL query says what result you want and the database figures it out for you. You could implement most of the above as constraints and triggers, but if you did the database wouldn't understand why you're doing that and would not be able to use that to help you out. By stating to the database "this is the primary key" and it can handle that as it sees fit. The database adds the necessary constraints. The database adds its optimizations, which can be more than indexing. The database can use this information in queries to identify unique rows. You get the benefit of 50 years of database theory.
Finally, it lets people know the purpose of the column. Anyone can read the schema and know that column is the primary key. That anyone can be you six months from now.