Tags: mysql, database, csv, doctrine-orm, database-normalization

Doctrine/MySQL performance: 470 columns in one table


I have an external CSV file that I need to import into a MySQL database: the CSV has 473 columns (144k rows), which in my opinion is too many columns for a single table.

The problem: I was thinking of normalizing the data and splitting it into several tables, but that would require extra work every time a new CSV is released (with more or fewer columns).

Is it okay if I keep the structure of the CSV intact and use one big table? What is the performance impact of each approach on MySQL/Doctrine?

The data: I don't have ownership of this data, so I can't redesign it: it comes from government public resources as-is, with no duplicate columns, so there is no obvious way to split it. I must take it as it is. Any additional categorization/splitting is extra work and may change with the next data update.
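On the "one big table" route: MySQL allows up to 4096 columns per table (1017 for InnoDB), so 473 is within the hard limits. A minimal sketch of generating the wide table's DDL straight from the CSV header, so a re-released CSV with different columns just regenerates the schema; `ddl_from_header` and the all-`TEXT` typing are my own simplifying assumptions, not part of the original setup:

```python
import csv
import io

def ddl_from_header(csv_text, table="wide_import"):
    """Generate a CREATE TABLE statement from a CSV header row.

    Every column is typed TEXT for simplicity; tighten the types
    later once the data is loaded and profiled."""
    header = next(csv.reader(io.StringIO(csv_text)))
    cols = ",\n".join(f"  `{name.strip()}` TEXT" for name in header)
    return f"CREATE TABLE `{table}` (\n{cols}\n);"

# Tiny stand-in for the real 473-column header.
sample = "id,name,provider\n1,foo,gov\n"
print(ddl_from_header(sample))
```

The generated statement can then be executed once per CSV release, followed by `LOAD DATA INFILE` or a batched insert, instead of hand-maintaining a 473-column entity.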


Solution

  • Digging deep into the CSV data, I found an interesting organization: it can be split into 18 different tables (providers).

    Each table has its own columns (some columns exist in multiple tables), but the largest one has around 180 columns.

    This is as far as I can split the data: since I don't have ownership of the CSV, I cannot go ahead and just group similar columns/tables any further.
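The per-provider split described above can be sketched as follows; `split_columns` and the `groups` mapping (provider name to its column list) are hypothetical names I introduce for illustration, and the mapping itself would be the one discovered in the CSV, with shared columns deliberately repeated in several groups:

```python
import csv
import io

def split_columns(csv_text, groups):
    """Split one wide CSV into several narrower CSVs.

    `groups` maps a group name (e.g. a provider) to the list of
    column names belonging to it; a column may appear in more than
    one group, mirroring the overlap between provider tables."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    out = {}
    for name, cols in groups.items():
        buf = io.StringIO()
        writer = csv.DictWriter(buf, fieldnames=cols)
        writer.writeheader()
        for row in rows:
            # Keep only this group's columns for each row.
            writer.writerow({c: row[c] for c in cols})
        out[name] = buf.getvalue()
    return out

# Tiny stand-in: two "providers" sharing the id column.
sample = "id,a,b\n1,x,y\n"
parts = split_columns(sample, {"p1": ["id", "a"], "p2": ["id", "b"]})
print(parts["p1"])
```

Each resulting CSV can then be loaded into its own, much narrower table, and the `groups` mapping is the only thing to update when the next data release changes the columns.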