mysql, innodb, myisam, large-data, database

MySQL MyISAM table crashes with lots of inserts and selects, what solution should I pick?


I have the following situation:

A MySQL MyISAM database on an Amazon EC2 instance, with PHP on an Apache webserver. We need to store incoming packages as JSON in MySQL. For this I use a staging database: a cronjob runs every minute and moves old rows (a WHERE clause on the DateTime column against a timestamp of two minutes ago) to another table named stage2.
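To make that concrete, here is a minimal sketch of what such a per-minute move job could look like; the DateTime column and the stage1/stage2 table names come from the question, but the exact schema and the assumption that stage2 mirrors stage1 are mine:

    -- Hypothetical per-minute move job (assumes stage2 has the same structure as stage1):
    INSERT INTO stage2
        SELECT * FROM stage1
        WHERE `DateTime` < NOW() - INTERVAL 2 MINUTE;

    DELETE FROM stage1
    WHERE `DateTime` < NOW() - INTERVAL 2 MINUTE;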

The stage1 table holds only current data: around 35k rows normally, and up to 100k when it's busy. We can reach 50k new rows a minute, which comes down to roughly 10k insert queries (each inserting several rows). The insert looks like this:

INSERT DELAYED IGNORE INTO stage1 VALUES ( ... ), (....), (....), (....), (...)

Then we have 4 scripts, each running roughly every 10 seconds, doing the following:

  1. Grab the max RowID from stage1 (the primary key).
  2. Export the data from the previous max RowID up to that RowID:
     a) 2 scripts are written in bash and use the mysql command-line client to export;
     b) 1 script is written in node.js and exports with SELECT ... INTO OUTFILE;
     c) 1 script is written in PHP and runs a plain SELECT, looping through each row.
  3. Send the data to an external client.
  4. Write the last send time and last RowID to a MySQL table so the script knows where to resume next time (a sketch of this checkpointing follows below).
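A rough sketch of that checkpointed export, using the INTO OUTFILE variant; everything except stage1 and RowID (the bookmark table export_state, its columns, the file path) is an illustrative name, not taken from the question:

    -- Hypothetical checkpointed export; export_state and its columns are assumed names.
    -- 1. Read the previous position and the current maximum.
    SELECT last_rowid INTO @last_rowid FROM export_state WHERE script = 'node_export';
    SELECT MAX(RowID) INTO @max_id FROM stage1;

    -- 2. Export everything in between (the node.js script uses INTO OUTFILE like this).
    SELECT *
    INTO OUTFILE '/tmp/stage1_batch.csv'
    FIELDS TERMINATED BY ','
    FROM stage1
    WHERE RowID > @last_rowid AND RowID <= @max_id;

    -- 3. Remember where this run ended, for the next run.
    UPDATE export_state
    SET last_rowid = @max_id, last_sent = NOW()
    WHERE script = 'node_export';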

Then, as mentioned, one cronjob moves old data from stage1 to stage2 every minute.

Everything worked well for a long time, but our user base is growing and during rush hours the stage1 table now crashes every so often. We can repair it easily, but that is not a real fix because we are down while the repair runs. Memory and CPU are fine during rush hours, but when stage1 crashes, everything crashes.

Also worth mentioning: I don't mind losing some rows in a failure, so I don't need any special backup plan in case something goes wrong.

What I have done so far:

  1. Added DELAYED and IGNORE to the insert statements.
  2. Tried switching to InnoDB, but that was even worse, mainly because of the amount of memory it needs. My EC2 instance is currently a t2.medium, which has 4GB of memory and 2 vCPUs with burst capacity. Following https://dba.stackexchange.com/questions/27328/how-large-should-be-mysql-innodb-buffer-pool-size, I ran this query:

    SELECT CEILING(Total_InnoDB_Bytes * 1.6 / POWER(1024, 3)) AS RIBPS
    FROM (
        SELECT SUM(data_length + index_length) AS Total_InnoDB_Bytes
        FROM information_schema.tables
        WHERE engine = 'InnoDB'
    ) A;

    It returned 11GB. I tried 3GB, which is about the maximum for my instance (roughly 80% of RAM). Since that was even more unstable, I switched every table back to MyISAM yesterday.

  3. Recreated the stage1 table structure.

What are my limitations?

  1. I cannot merge all 4 scripts into a single export, because the output to the client differs; for example, some clients use JSON, others XML.

Options I'm considering

  1. An m3.xlarge instance with 15GB of memory is 5 times more expensive, but if it's needed I'm willing to pay for it. Then switch to InnoDB again and see if it's stable?
  2. Can I move just stage1 to InnoDB and run it with a 3GB buffer pool, keeping the rest on MyISAM? (See the sketch after this list.)
  3. Try a NoSQL database or an in-memory database. Would that work?
  4. Queue the packages in memory, have the 4 scripts read the data from memory, and only write everything to MySQL afterwards. Is there some kind of tool for this?
  5. Move stage1 to an RDS instance running InnoDB.
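For option 2, a minimal sketch of what the mixed-engine setup might look like. The 3GB figure is from the question; the SET GLOBAL call only works on MySQL 5.7+, where the buffer pool can be resized at runtime, otherwise innodb_buffer_pool_size has to be set in my.cnf and needs a restart:

    -- Convert only the hot table; everything else stays on MyISAM.
    ALTER TABLE stage1 ENGINE = InnoDB;

    -- 3GB buffer pool (MySQL 5.7+ only; older versions need my.cnf plus a restart).
    SET GLOBAL innodb_buffer_pool_size = 3 * 1024 * 1024 * 1024;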

I'd love to get some advice and help on this! Perhaps I'm missing the easy answer? Or which options should I not consider at all?

Thanks, Sjoerd Perfors


Solution

  • Today we fixed these issues with the following setup:

    • An AWS load balancer in front of a t2.small "Worker" instance, where Apache and PHP handle the requests and send the data on to an EC2 MySQL instance we call "Main".
    • When the CPU of the t2.small instance goes above 50%, new instances are launched automatically and attached to the load balancer.
    • The "Main" EC2 instance runs MySQL with InnoDB.
    • Everything was updated to Apache 2.4 and PHP 5.5, with performance tweaks.
    • Fixed one script so it runs a lot faster.
    • InnoDB now has 6GB (a quick check of the new setup is sketched at the end).

    Things we tried that didn't work: setting up DynamoDB, but sending to that database took almost 5 seconds.

    Things we are considering: removing stage2 and doing backups directly from stage1. It seems that keeping this number of rows does not hurt performance.
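    As a rough sanity check on the new setup, the buffer pool size and the engines of the staging tables can be confirmed like this; the schema name is a placeholder, not from the question:

        -- Hypothetical check of the migrated setup (replace your_schema with the real schema name):
        SHOW VARIABLES LIKE 'innodb_buffer_pool_size';

        SELECT table_name, engine
        FROM information_schema.tables
        WHERE table_schema = 'your_schema'
          AND table_name IN ('stage1', 'stage2');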