Search code examples
database-normalization

Is this table in first normal form?


I am currently studying SQL normal forms.

Lets say I have the following table the primary key is userid

userid    FirstName    LastName    Phone
1         John         Smith       555-555
1         Tim          Jack        432-213
2         Sarah        Mit         454-541
3         Tom          jones       987-125

The book I'm reading states the following conditions must be true in order for a table to be in 1st normal form.

  1. Rows contain data about an entity.
  2. Columns contain data about attributes of the entities.
  3. All entries in a column are of the same kind.
  4. Each column has a unique name.
  5. Cells of the table hold a single value.
  6. The order of the columns is unimportant.
  7. The order of the rows is unimportant.
  8. No two rows may be identical.
  9. A primary key Must be assigned

I'm not sure if my table violates the 8th rule No two rows may be identical.

Because the first two records in my table

1         John         Smith       555-555
1         Tim          Jack        432-213

share the same userid does that mean that they are considered duplicate rows?

Or does duplicate records mean that every peace of data in the row has to be the same for the record to be considered a duplicate row see example below?

1         John         Smith       555-555
1         John         Smith       555-555

EDIT1: Sorry for the confusion

The question I was trying to ask is simple

Is this table below in 1st normal form?

userid    FirstName    LastName    Phone
1         John         Smith       555-555
1         Tim          Jack        432-213
2         Sarah        Mit         454-541
3         Tom          jones       987-125

Based on the 9 rules given in the textbook I think it is but I wasn't sure that

if rule 8 No two rows may be identical

was being violated because of two records that use the same primary key.

The class text book and prof isn't really that clear on this subject which is why I am asking this question.


Solution

  • Or does duplicate records mean that every peace of data in the row has to be the same for the record to be considered a duplicate row see example below?

    They mean that--the latter of your choices. Entire rows are what must be "identical". It's ok if two rows share the same values for one or more columns as long as one or more columns differ.

    That's because a relation holds a set of values that are tuples/rows/records, and set is a collection of values that are all different.

    But SQL & some relational algebras have different notions of "identical" in the case of NULLs compared to the relational model without NULLs. You should read what your textbook says about it if you want to know exactly what they mean by it. Two rows that have NULL in the same column are considered different. (Point 9 might be summarizing something involving NULLs. Depends on the explanation in the book.)


    PS

    There's no single notion of what a relation is. There is no single notion of "identical". There is no single notion of 1NF.

    Points 3-8 are better described as (poor) ways of restricting how to interpret a picture of a table to get a relation. Your textbook seems to be (strangely) making "1NF" a property of such an interpretation of a picture of a table. Normally we simply define a relation to be a certain thing so if you have one then it has to have the defined properties. Then "in 1NF" applies to a relation & either means "is a relation" & isn't further used or it means certain further restrictions hold. A relation is a set of tuples/rows/records, and in the kind of relation your 3-8 describes they are sets of attribute/column/field name-value pairs & the values paired with a name have to be of the type paired with that name in some schema/heading that is a set of name-type pairs that is defined either as part of the relation or external to it.

    Your textbook doesn't seem to present things clearly. It's definition of "1NF" is also idiosyncratic in that although 3-8 are mathematical, 1 & 2 are informal/heuristic (& 9 could be either or both).