Might be a dummy question, but I'm creating a system in which users are not allowed to exceed X mounts of kb in pictures they can upload.
When uploading a picture I update the images
table with the size of the image in KB and other info.
Now, should I also keep track of the total size of each users images on the users
table? Or should I just do a select sum(size) from images where user = xxx
every time I want to check the limit? Which might with every new upload?
What would be the best approach from a relational point of view?
Storing the SUM in the users
table is one type of denormalization.
This can be worthwhile if you need to query the sum frequently, and it's too slow to do the aggregate query every time you need it.
But you accept the risk that the stored sum in the users
table will become out of sync with the real SUM(size)
of the associated images.
You wouldn't think this would be difficult, but in practice, there are lots of edge case where the stored sum fails to be updated. You will end up periodically running the aggregate query in the background, to overwrite the stored sum, just in case it has gotten out of sync.
Denormalization is more work for you as a coder, because you have to write code to correct for anomalies like that. Be conservative about how many cases of denormalization you create, because each one obligates you to do more work.
But if it's very important that your query for the sum return the result faster than is possible by running the aggregate query, then that's what you have to do.
In my experience, all optimizations come with a price.