Search code examples
mysqlperformanceinnodbpartitioning

MySQL Partitioning (innoDB) - Large table


I have a MySQL very large database (1 billion rows) like this:

database : products("name","caracteristics")

Both columns are VARCHAR(50).

actually, it have no KEY sat, but "name" will be unique, so I think I will alter it as "name" PRIMARY_KEY. (I should have done that before.. now I need to perform a remove duplicate query before adding primary_key option I guess)

My problem is, when performing a simple query on the table, it takes ages literally.

SELECT caracteristics WHERE name=blabla LIMIT 1; //takes ages.

I was thinking of partitioning the existing table.

So here are the question:

  • Is it a good idea to fix my performance issues?
  • How can I achieve that?
  • Is my idea of ALTER TABLE to set 'name' column as PRIMARY_KEY a good idea also?

  • also about the duplicate query, I found this around here, am I doing it properly? (don't want to mess up my table...)

delete a
from products a
left join(
select max(name) maxname, caracteristics
from products
group by caracteristics) b
on a.name = maxname and
a.caracteristics= b.caracteristics
where b.maxname IS NULL;

Solution

  • you can also direct set a PRIMARY KEY with the ignore option like this:

    ALTER IGNORE TABLE `products` ADD PRIMARY KEY(name);
    

    this will delete all duplicates from name.

    sample

    MariaDB [l]> CREATE TABLE `products` (
        ->   `name` varchar(50) NOT NULL DEFAULT '',
        ->   `caracteristics` varchar(50) DEFAULT NULL
        -> ) ENGINE=InnoDB DEFAULT CHARSET=utf8;
    Query OK, 0 rows affected (0.02 sec)
    
    MariaDB [l]> INSERT INTO `products` (`name`, `caracteristics`)
        -> VALUES
        ->     ('val1', 'asdfasdfasdf'),
        ->     ('val2', 'asdasDasd'),
        ->     ('val3', 'aesfawfa'),
        ->     ('val1', '99999999');
    Query OK, 4 rows affected (0.01 sec)
    Records: 4  Duplicates: 0  Warnings: 0
    
    MariaDB [l]> select * from products;
    +------+----------------+
    | name | caracteristics |
    +------+----------------+
    | val1 | asdfasdfasdf   |
    | val2 | asdasDasd      |
    | val3 | aesfawfa       |
    | val1 | 99999999       |
    +------+----------------+
    4 rows in set (0.00 sec)
    
    MariaDB [l]> ALTER IGNORE TABLE `products` ADD PRIMARY KEY(name);
    Query OK, 4 rows affected (0.03 sec)
    Records: 4  Duplicates: 1  Warnings: 0
    
    MariaDB [l]> select * from products;
    +------+----------------+
    | name | caracteristics |
    +------+----------------+
    | val1 | asdfasdfasdf   |
    | val2 | asdasDasd      |
    | val3 | aesfawfa       |
    +------+----------------+
    3 rows in set (0.00 sec)
    
    MariaDB [l]>
    

    test ADD PRIMARY KEY / INSERT IGNORE

    Here is a test between add Primary key and insert ignore into. and you can see that add Primary key (90 sec / 120 sec) is a little bit faster in this sample

    MariaDB [l]> CREATE TABLE `bigtable10m` (
        ->   `id` varchar(32) NOT NULL DEFAULT ''
        -> ) ENGINE=InnoDB DEFAULT CHARSET=utf8;
    Query OK, 0 rows affected (0.02 sec)
    
    MariaDB [l]>
    MariaDB [l]> INSERT INTO `bigtable10m`
        -> select lpad(seq,8,'0') from seq_1_to_10000000;
    Query OK, 10000000 rows affected (24.24 sec)
    Records: 10000000  Duplicates: 0  Warnings: 0
    
    MariaDB [l]>
    MariaDB [l]> SELECT * FROM `bigtable10m` LIMIT 10;
    +----------+
    | id       |
    +----------+
    | 00000001 |
    | 00000002 |
    | 00000003 |
    | 00000004 |
    | 00000005 |
    | 00000006 |
    | 00000007 |
    | 00000008 |
    | 00000009 |
    | 00000010 |
    +----------+
    10 rows in set (0.00 sec)
    
    MariaDB [l]>
    MariaDB [l]> CREATE TABLE `bigtable30m` (
        ->   `id` varchar(32) NOT NULL DEFAULT ''
        -> ) ENGINE=InnoDB DEFAULT CHARSET=utf8;
    Query OK, 0 rows affected (0.02 sec)
    
    MariaDB [l]>
    MariaDB [l]> INSERT INTO `bigtable30m` SELECT * FROM `bigtable10m`;
    Query OK, 10000000 rows affected (28.49 sec)
    Records: 10000000  Duplicates: 0  Warnings: 0
    
    MariaDB [l]> INSERT INTO `bigtable30m` SELECT * FROM `bigtable10m`;
    Query OK, 10000000 rows affected (29.01 sec)
    Records: 10000000  Duplicates: 0  Warnings: 0
    
    MariaDB [l]> INSERT INTO `bigtable30m` SELECT * FROM `bigtable10m`;
    Query OK, 10000000 rows affected (32.98 sec)
    Records: 10000000  Duplicates: 0  Warnings: 0
    
    MariaDB [l]>
    MariaDB [l]> ALTER IGNORE TABLE `bigtable30m` ADD PRIMARY KEY(id);
    Query OK, 30000000 rows affected (1 min 32.34 sec)
    Records: 30000000  Duplicates: 20000000  Warnings: 0
    
    MariaDB [l]>
    MariaDB [l]> DROP TABLE `bigtable30m`;
    Query OK, 0 rows affected (0.52 sec)
    
    MariaDB [l]>
    MariaDB [l]> CREATE TABLE `bigtable30m` (
        ->   `id` varchar(32) NOT NULL DEFAULT ''
        -> ) ENGINE=InnoDB DEFAULT CHARSET=utf8;
    Query OK, 0 rows affected (0.03 sec)
    
    MariaDB [l]>
    MariaDB [l]> INSERT INTO `bigtable30m` SELECT * FROM `bigtable10m`;
    Query OK, 10000000 rows affected (37.29 sec)
    Records: 10000000  Duplicates: 0  Warnings: 0
    
    MariaDB [l]> INSERT INTO `bigtable30m` SELECT * FROM `bigtable10m`;
    Query OK, 10000000 rows affected (41.87 sec)
    Records: 10000000  Duplicates: 0  Warnings: 0
    
    MariaDB [l]> INSERT INTO `bigtable30m` SELECT * FROM `bigtable10m`;
    Query OK, 10000000 rows affected (30.87 sec)
    Records: 10000000  Duplicates: 0  Warnings: 0
    
    MariaDB [l]>
    MariaDB [l]> CREATE TABLE bigtable_unique (
        ->   `id` varchar(32) NOT NULL DEFAULT '',
        ->  PRIMARY KEY (id)
        -> );
    Query OK, 0 rows affected (0.02 sec)
    
    MariaDB [l]>
    MariaDB [l]> INSERT IGNORE bigtable_unique SELECT * FROM `bigtable30m`;
    Query OK, 10000000 rows affected, 65535 warnings (1 min 57.99 sec)
    Records: 30000000  Duplicates: 20000000  Warnings: 20000000
    
    MariaDB [l]>