Search code examples
mysqlnode.jsoptimizationindexingdatabase-partitioning

MYSQL: need help for rapidly growing table and decreasing speed (4mio rows)


I'm facing some issues with a rapidly growing table at increasing speed (currently 4mio rows, 300k inserts a day). I hope I can get some ideas and advices here to improve my setup and squeeze the last bit out of my box, before it takes down my website in near future.

The setup:

    Intel i7 720 
    8GB RAM
    2x750GB SATA RAID 0
    CentOS
    MySQL 5.5.10
    Node.js + node-lib_mysql-client

The table definition:

CREATE TABLE IF NOT EXISTS `canvas` (
`id` bigint(20) unsigned NOT NULL AUTO_INCREMENT,
`x1` int(11) NOT NULL,
`y1` int(11) NOT NULL,
`x2` int(11) NOT NULL,
`y2` int(11) NOT NULL,
`c` int(4) unsigned NOT NULL,
`s` int(3) unsigned NOT NULL,
`m` bigint(20) unsigned NOT NULL,
`r` varchar(32) NOT NULL,
PRIMARY KEY (`id`,`x1`,`y1`) KEY_BLOCK_SIZE=1024,
KEY `x1` (`x1`,`y1`) KEY_BLOCK_SIZE=1024,
KEY `x2` (`x2`,`y2`) KEY_BLOCK_SIZE=1024
) ENGINE=MyISAM  DEFAULT CHARSET=latin1 ROW_FORMAT=COMPACT KEY_BLOCK_SIZE=4
/*!50100 PARTITION BY HASH ( (
(
x1 MOD 10000
)
) + y1 MOD 10000)
PARTITIONS 10 */ AUTO_INCREMENT=13168904 ;

The query:

SELECT x1,y1,x2,y2,s,c,r,m FROM canvas
WHERE 1 AND ((
 x1 >= 0
 AND x1 <= 400
 AND y1 >= 0
 AND y1 <= 400
 ) OR ( 
 x2 >= 0
 AND x2 <= 400
 AND y2 >= 0
 AND y2 <= 400
 ) )
  ORDER BY id desc

That's the only query I'm executing, except for the fact that the values for x1,y1,x2 and y2 change per query. It's a 2D canvas and each row represents a line on the canvas. Guess it's also important to know that the maximum range selected for 1 field is never bigger than 1200 (pixels). A few weeks ago I upgraded to MySQL 5.5.10 and started using partitions. The 'x1 % 10000' hashw as my first and unaware approach to get into the partition topic. It already gave me a decent boost in SELECT speed, but I'm sure there's still room for optimizations.

Oh, and before you ask... I'm aware of the fact that I'm using a MyISAM table. A friend of mine suggested innoDB, but tried it already and the result was a 2 times bigger table and a big drop in SELECT performance. I don't need no fancy transactions and stuff.... all I need is the best possible SELECT performance and a decent performance with INSERTs.

What would you change? Could I perhaps tweak my indexes somehow? Does my partion setup make any sense at all? Should I perhaps increase the number of partition files?

All suggestions are welcome... I also discussed a local replication into a memory table with a friend, but I'm sure it's only a matter of time until the table size would exeed my RAM and a swapping box is a fairly ugly thing to see.

When you think about my issue please keep in mind that it's growing rapidly and unpredictably. In case it goes viral somewhere for some reason, I expect to see more than 1mio INSERTS a day.

Thank you for reading and thinking about it. :)

EDIT: The requested EXPLAIN result

select_type     table   type    possible_keys   key     key_len     ref     rows    Extra
SIMPLE  canvas  index_merge     x1,x2   x1,x2   8,8     NULL    133532  Using sort_union(x1,x2); Using where; Using fileso...

EDIT2: The requested my.cnf

[mysqld]
datadir=/var/lib/mysql
socket=/var/lib/mysql/mysql.sock
user=mysql
# Disabling symbolic-links is recommended to prevent assorted security risks
symbolic-links=0

innodb_buffer_pool_size = 1G
sort_buffer_size = 4M
read_buffer_size = 1M
read_rnd_buffer_size = 16M
innodb_file_format = Barracuda

query_cache_type = 1
query_cache_size = 100M

# http://dev.mysql.com/doc/refman/5.5/en/performance-schema.html
;performance_schema


[mysqld_safe]
log-error=/var/log/mysqld.log
pid-file=/var/run/mysqld/mysqld.pid

The innoDB values are for my innoDB try... guess they are not necessary anymore. The sever runs 4 other Websites as well, but they are rather small and not really worth to mention. I'm gonna move this project to a dedicated box soon anyways. Your ideas can be radical - I don't mind experiments.

EDIT3 - BENCHMARKS WITH INDEXES

Ok guys... I've made some benchmarks with different indexes and the results are pretty good so far. For this benchmark I've was selecting all rows within a box of 2000x2000 pixels.

SELECT SQL_NO_CACHE x1,y1,x2,y2,s,c FROM canvas_test WHERE 1 AND (( x1 BETWEEN -6728 AND -4328 AND y1 BETWEEN -6040 AND -4440 ) OR (  x2 BETWEEN -6728 AND -4328 AND y2 BETWEEN -6040 AND -4440 ) )  ORDER BY id asc

Using the table/index definition I've posted above the avarage query time was: 1740ms

Then I dropped all indexes, except for the primary key -> 1900ms

added one index for x1 -> 1800ms

added one index for y1 -> 1700ms

added one index for x2 -> 1500ms

added one index for y2 -> 900ms!

That's quite astonishing so far... for some reason I was thinking making combined indexes for x1/y1 and x2/y2 would make sense somehow, but actually it looks like I was wrong.

EXPLAIN now returns this:

id  select_type     table   type    possible_keys   key     key_len     ref     rows    Extra
1   SIMPLE  canvas_test     index_merge     x1,y1,x2,y2     y1,y2   4,4     NULL    263998  Using sort_union(y1,y2); Using where; Using fileso..

Now I'm wondering why it's using y1/y2 as keys and not all four?

However, I'm still looking for more ideas and advices, especially regarding partitions and proper hashing.


Solution

    1. How much memory is your server currently utilizing?
    2. Is this the only database/table on the server?
    3. Are you using MyISAM exclusively?

    MyISAM is okay to use, so long as you're not updating your rows. When you update a row on a MyISAM table MySQL locks the entire table, blocking any SELECTs and INSERTS from executing until the UPDATE is complete. UPDATE has precedence over SELECT, so if you have a lot of UPDATEs running, your SELECTS will wait until they're all complete before they return any rows.

    If that is okay with you, then move to your server configuration. What does your my.cnf file look like? You'll want to optimize this file to maximize the amount of memory you can use for indexes. If these SELECTs are slowing down, it's because your table indexes are not fitting in memory. If MySQL can't fit your table indexes into memory, then it has to go to disk and do a table scan to fetch your data. This will kill performance.

    EDIT 5/18/2011 9:30PM EST

    After looking at your my.cnf, I notice you have zero MyISAM optimizations in place. Your starting place is going to be the key_buffer_size variable. This variable is, as a rule of thumb, set somewhere between 25% and 50% of the total available memory on your system. Your system has 8GB memory available, so somewhere around 3GB is a minimum starting point, I'd say. However, you can estimate how much you will need and optimize it as needed if you know you have control over the other variables on the system.

    What you should do is cd to your mysql data dir (typically /var/lib/mysql) which is where all your data files are located. A quick way to tell how much index data you have is to do

     sudo du -hc `find . -type f -name "*.MYI"
    

    This command will look at the size of all your MyISAM Index files and tell you their total size. If you have enough memory, you want to make your key_buffer_size in your my.cnf BIGGER than the total size of all your MYI files. This will ensure that your MyISAM indexes are in memory, so MySQL won't have to hit the disk for the index data.

    A quick note, don't go increasing your key_buffer_size willy nilly. This is just one area of MySQL that needs memory, there are other moving parts that you need to balance memory usage with. MySQL connections take up memory, and different table engines use different memory pools for their indexes, and MySQL uses other bits of memory for different things. If you run out of memory because you set the key_buffer_size too large, your server could start paging (using virtual memory, which will KILL performance even MORE) or worse, crash. Start with smaller values if you're unsure, check your memory usage, and increase it until you're satisfied with the performance, and your server isn't crashing.