We have a MySQL table called posts_content.
The structure is as follows:
CREATE TABLE IF NOT EXISTS `posts_content` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`post_id` int(11) NOT NULL,
`forum_id` int(11) NOT NULL,
`content` longtext CHARACTER SET utf8 COLLATE utf8_unicode_ci NOT NULL,
PRIMARY KEY (`id`)
) ENGINE=MyISAM DEFAULT CHARSET=latin1 AUTO_INCREMENT=79850 ;
The problem is that the table is getting huge: many gigabytes of data (we have a crawling engine).
We insert data into the table on a daily basis but seldom retrieve it. As the table keeps growing, it is becoming difficult to handle.
We discussed two possibilities
I hope I have explained the problem clearly. What I need to know is which of the above two would be the better solution in the long run, and what the advantages and disadvantages of each are.
Thanking you
The difference is that in the first case you leave MySQL to do the sharding, and in the second case you do it on your own. MySQL won't scan any shards that do not contain the data; however, if you have a query with WHERE forum_id IN(...), it may need to scan several shards. As far as I remember, in that case the operation is synchronous, i.e. MySQL queries one partition at a time, and you may want to implement it asynchronously. Generally, if you do the partitioning on your own you are more flexible, but for simple partitioning based on forum_id, if you query only one forum_id at a time, MySQL partitioning is OK.
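For comparison, doing the partitioning yourself could look roughly like the sketch below. The shard table names (posts_content_0 to posts_content_3), the shard count of 4, and the routing rule forum_id MOD 4 are all assumptions made for illustration; none of them come from the question.

```sql
-- Hypothetical shard tables with the same structure as posts_content.
-- CREATE TABLE ... LIKE copies the columns and indexes, not the rows.
CREATE TABLE IF NOT EXISTS posts_content_0 LIKE posts_content;
CREATE TABLE IF NOT EXISTS posts_content_1 LIKE posts_content;
CREATE TABLE IF NOT EXISTS posts_content_2 LIKE posts_content;
CREATE TABLE IF NOT EXISTS posts_content_3 LIKE posts_content;

-- The application picks the table: forum_id 42 goes to shard 42 MOD 4 = 2.
-- The values here are sample data.
INSERT INTO posts_content_2 (post_id, forum_id, content)
VALUES (1001, 42, 'sample crawled text');

-- Query for a single forum: only one shard table is touched.
SELECT id, post_id, content
FROM posts_content_2
WHERE forum_id = 42;

-- Query for forum_id IN (42, 7): the application has to visit
-- shard 2 (42 MOD 4) and shard 3 (7 MOD 4) and merge the results itself.
SELECT id, post_id, forum_id, content FROM posts_content_2 WHERE forum_id = 42
UNION ALL
SELECT id, post_id, forum_id, content FROM posts_content_3 WHERE forum_id = 7;
```

Instead of the UNION ALL, the application could send the two SELECTs over separate connections and run them in parallel, which is the kind of flexibility you get from doing it yourself. The price is that routing logic lives in your code, and the auto-increment id is unique only within one shard table.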
My advice is to read the MySQL documentation on partitioning, especially the restrictions and limitations section, and then decide.
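One restriction from that section matters for this table in particular: every unique key, including the primary key, must contain all columns used in the partitioning expression. A minimal sketch of a natively partitioned variant follows; the table name posts_content_partitioned and the partition count of 16 are assumptions, not recommendations.

```sql
-- Hypothetical partitioned copy of the table from the question.
-- forum_id is added to the primary key because MySQL requires the
-- partitioning column to be part of every unique key on the table.
-- Note: MySQL 8.0 no longer supports partitioned MyISAM tables;
-- there you would need ENGINE=InnoDB instead.
CREATE TABLE IF NOT EXISTS `posts_content_partitioned` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`post_id` int(11) NOT NULL,
`forum_id` int(11) NOT NULL,
`content` longtext CHARACTER SET utf8 COLLATE utf8_unicode_ci NOT NULL,
PRIMARY KEY (`id`, `forum_id`)
) ENGINE=MyISAM DEFAULT CHARSET=latin1
PARTITION BY HASH (`forum_id`)
PARTITIONS 16;

-- Check which partitions a query would read.
-- MySQL 5.1 to 5.6 use EXPLAIN PARTITIONS; from 5.7 on, plain EXPLAIN
-- already includes a "partitions" column.
EXPLAIN PARTITIONS
SELECT id, post_id, content
FROM posts_content_partitioned
WHERE forum_id = 42;

-- An IN list may map to several partitions.
EXPLAIN PARTITIONS
SELECT id, post_id, content
FROM posts_content_partitioned
WHERE forum_id IN (7, 42, 99);
```

If the partitions column of the first EXPLAIN lists a single partition, pruning is working for the one-forum-at-a-time access pattern described above.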