Search code examples
mysqlsqldatabaseperformancedatabase-normalization

SQL: Tips on structuring table to fit 3d data?


I would just like some guidance on database optimization. I have what is essentially a daily survey with a couple sliders (so float 0-1). Each day, X amount of people from the same company will fill this out, submitting their values for this "survey".

What I understand is that you should not create tables for each user as is it inefficient and unmanageable. So, I should not create a table for each company.

What I am looking for is guidance on putting all this data into one table. Here are some specific questions:

  • If each company has a 5-digit ID, and I were to record all data in the same table with a column showing id and a column with a timestamp, how efficient would it be getting all data points from a certain company (aka 5 digit ID) when the data gets into the millions or tens of millions?
  • How much does the size of a table affect the efficiency when accessing?
  • If I were to use a method with the 5-digit ID, would you recommend that I also assign every single "survey entry" (aka row) with its own index, per se?
  • Do you have any general tips or ideas for structuring a single table with my particular case?

Thanks.


Solution

  • It's not advisable to use a single table for all your data.

    I think you could model it with few tables (schem, a in Relational notation):

    Users(ID,name, bla, bla, bla) --user's data
    Company(ID, Name, bla, bla) --company's data
    
    question(ID,question) --a single question
    survey(ID, date, some general stuff about that survey) --track each survey    
    answer(ID, question_id, survey_id, user_id, company_id, answer_value) --an answer to a question (in a survey)
    

    NOTE: foreign keys are indicated as referenced_table_ID

    With this schema you can:

    • give same survey to multiple companies,
    • have a single user taking same survey with different values from different companies
    • track each individual question
    • track each individual answer
    • question do not have data replication of course, once done, you can
    • of course you can write a view to have all in a single table