Search code examples
mysqlselectoptimizationinnodb

Huge innodb tables with SELECT performance issue


I have two huge innodb tables (page: +40M rows, +30Gb and stat: +45M rows, +10Gb). I have a query that selects rows from the join of these two tables and it used to take about a second for execution. Recently it's taking more than 20 seconds (sometime up to few minutes) for the exact same query to be completed. I suspected that with lot's of inserts and updates it might need an optimization. I ran OPTIMIZE TABLE on the table using phpMyAdmin but no improvements. I've Googled a lot but couldn't find any content helping me on this situation.

The query I mentioned earlier looks like below:

SELECT `c`.`unique`, `c`.`pub`
    FROM `pages` `c`
    LEFT JOIN `stat` `s` ON `c`.`unique`=`s`.`unique`
    WHERE `s`.`isc`='1'
      AND `s`.`haa`='0'
      AND (`pubID`='24')
    ORDER BY `eid` ASC LIMIT 0, 10

These are the tables structure:

CREATE TABLE `pages` (
  `eid` int(10) UNSIGNED NOT NULL,
  `ti` text COLLATE utf8_persian_ci NOT NULL,
  `fat` text COLLATE utf8_persian_ci NOT NULL,
  `de` text COLLATE utf8_persian_ci NOT NULL,
  `fad` text COLLATE utf8_persian_ci NOT NULL,
  `pub` varchar(100) COLLATE utf8_persian_ci NOT NULL,
  `pubID` int(10) UNSIGNED NOT NULL,
  `pubn` text COLLATE utf8_persian_ci NOT NULL,
  `unique` tinytext COLLATE utf8_persian_ci NOT NULL,
  `pi` tinytext COLLATE utf8_persian_ci NOT NULL,
  `kw` text COLLATE utf8_persian_ci NOT NULL,
  `fak` text COLLATE utf8_persian_ci NOT NULL,
  `te` text COLLATE utf8_persian_ci NOT NULL,
  `fae` text COLLATE utf8_persian_ci NOT NULL,
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_persian_ci;
ALTER TABLE `pages`
  ADD PRIMARY KEY (`eid`),
  ADD UNIQUE KEY `UNIQ` (`unique`(128)),
  ADD KEY `pub` (`pub`),
  ADD KEY `unique` (`unique`(128)),
  ADD KEY `pubID` (`pubID`) USING BTREE;
ALTER TABLE `pages` ADD FULLTEXT KEY `faT` (`fat`);
ALTER TABLE `pages` ADD FULLTEXT KEY `faA` (`fad`,`fae`);
ALTER TABLE `pages` ADD FULLTEXT KEY `faK` (`fak`);
ALTER TABLE `pages` ADD FULLTEXT KEY `pubn` (`pubn`);
ALTER TABLE `pages` ADD FULLTEXT KEY `faTAK` (`fat`,`fad`,`fak`,`fae`);
ALTER TABLE `pages` ADD FULLTEXT KEY `ab` (`de`,`te`);
ALTER TABLE `pages` ADD FULLTEXT KEY `Ti` (`ti`);
ALTER TABLE `pages` ADD FULLTEXT KEY `Kw` (`kw`);
ALTER TABLE `pages` ADD FULLTEXT KEY `TAK` (`ti`,`de`,`kw`,`te`);
ALTER TABLE `pages`
  MODIFY `eid` int(10) UNSIGNED NOT NULL AUTO_INCREMENT;


CREATE TABLE `stat` (
  `sid` int(10) UNSIGNED NOT NULL,
  `unique` tinytext COLLATE utf8_persian_ci NOT NULL,
  `haa` tinyint(1) UNSIGNED NOT NULL,
  `isc` tinyint(1) NOT NULL,
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_persian_ci;
ALTER TABLE `stat`
  ADD PRIMARY KEY (`sid`),
  ADD UNIQUE KEY `Unique` (`unique`(128)),
  ADD KEY `isc` (`isc`),
  ADD KEY `haa` (`haa`),
ALTER TABLE `stat`
  MODIFY `sid` int(10) UNSIGNED NOT NULL AUTO_INCREMENT;

The following query took only 0.0126 seconds with 38685601 total results as said by phpMyAdmin:

SELECT `sid` FROM `stat` WHERE `s`.`isc`='1' AND `s`.`haa`='0'

and this one took 0.0005 seconds with 5159484 total results

SELECT `eid`, `unique`, `pubn`, `pi` FROM `pages` WHERE `pubID`='24'

Am I missing something? Can anybody help?


Solution

  • The slowdown is probably due to scanning so many rows, and that is now more than can fit in cache. So, let's try to improve the query.

    • Replace INDEX(pubID) with INDEX(pubID, eid) -- This may allow both the WHERE and ORDER BY to be handled by the index, thereby avoiding a sort.
    • Replace TINYTEXT with VARCHAR(255) or some smaller limit. This may speed up tmp tables.
    • Don't use prefix index on eid -- its an INT !
    • Don't say UNIQUE with prefixing -- UNIQUE(x(128)) only checks the uniqueness of the first 128 columns !
    • Once you change to VARCHAR(255) (or less), you can apply UNIQUE to the entire column.
    • The biggest performance issue is filtering on two tables -- can you move the status flags into the main table?
    • Change LEFT JOIN to JOIN.
    • What does unique look like? If it is a "UUID", that could further explain the trouble.
    • If that is a UUID that is 39 characters, the string can be converted to a 16-byte column for further space savings (and speedup). Let's discuss this further if necessary.

    5 million results in 0.5ms is bogus -- it was fetching from the Query cache. Either turn off the QC or run with SELECT SQL_NO_CACHE...