I have a MySQL table which contains 20 millions of rows and 2 columns and structure like below.
item_id (unsigned int[11] PK) | item_name (varchar[50]) | image_path (varchar[50])
Should I separate image_path
column to another table, like
image_id | image_path
What will be performance gain /loss? Will join requests faster than select * requests for only one table on this large data set?
My database only one table for now. If I normalize it will be two.
and my requests will be like
SELECT items.item_name, images.image_path
FROM items
LEFT JOIN images on images.image_id = items.image_id
WHERE item_id = 123
and I'm currently using
SELECT item_name, image_path FROM items WHERE item_id = 123
another hard option for me is MySQL Partitioning my table with RANGE Partitioning
What is the best way for performance?
In the query you show, you look up one row in items
by primary key value 123, and then join to one row in images
by primary key value. That's two primary key value lookups, when you could have done one.
In MySQL's default storage engine, InnoDB, a NULL
value doesn't take any storage space. So I would suggest if some of your item rows have an image_path and some don't, then just define image_path as nullable. That's bound to be better for performance.
You also ask about PARTITION BY RANGE
. With respect, 20 million rows is not a big table.
In the databases I manage, I partitioned one table by date because it has 4.7 billion rows (i.e. 235x the size of your table). There are quite a few other tables that have millions of rows, but they are fine without being partitioned.
As long as you use the index or primary key when you do lookups, you should be okay.