I have a Users
table with cars ownership represented as array of integers (foreign keys). Personally, it is much easier to use array data type instead of normalizing this table to 2NF as in my business logic I can iterate/fetch each car separately by querying Cars
table whenever needed. However, my focus here is not on the implementation details but on the best practices and potential pitfalls of using array data types in RDBMs systems.
Users table
╔════════╦══════╦═════════╗
║ userId ║ name ║ carsOwn ║
╠════════╬══════╬═════════╣
║ 1 ║ Jon ║ {2} ║
║ 2 ║ Yang ║ {1,3,4} ║
║ 3 ║ Ali ║ {1,4} ║
║ 4 ║ Kate ║ {1,2} ║
╚════════╩══════╩═════════╝
Cars table
╔═══════╦══════════╦
║ carId ║ name ║
╠═══════╬══════════╬
║ 1 ║ honda ║
║ 2 ║ toyota ║
║ 3 ║ smart-car║
║ 4 ║ bentley ║
╚═══════╩══════════╩
1NF (First Normal Form)
Each table cell should contain a single value. Each record needs to be unique.
2NF (Second Normal Form) Rules
Be in 1NF Single Column Primary Key
The important normal form here is the first one.
The SQL standard 1999 has somewhat widened the perspective here by introducing compound data types. The best way to look at it is that a datum is atomic if it is atomic in the context of the data model. Otherwise you'd have to consider each string non-atomic (it consists of characters!) and split it up.
So I'd say that there is nothing wrong in using arrays as long as you don't need to fetch individual array items in the database or — horribile dictu — perform joins with array items.
Now this is exactly what you want to do, so you should definitely normalize the data model. Your queries will be much simpler and probably faster too.