Search code examples
mysqlsqlperformanceinsert-select

How to speed up INSERT SELECT with join over string


I want to call insert select and I try to use this select (with help from this INSERT SELECT query when one column is unique)

SELECT minids.userid, username, password, full_name, country, email,
      (select openclipart_files.id from aiki_users, openclipart_files
       where aiki_users.userid = users.userid and avatar like '%svg' AND
             filename = avatar) as avatar,
      homepage, usergroup, notify, nsfwfilter
FROM aiki_users users
INNER JOIN (SELECT MIN(userid) as userid FROM aiki_users GROUP by username) 
           minids ON minids.userid = users.userid;

I use it with insert

INSERT INTO openclipart_users(id, user_name, password, full_name, country,
email, avatar, homepage, user_group, notify, nsfw_filter) SELECT ...

It take very long to execute (I cancel it after few minutes)

aiki_users table have 100k rows and openclipart_files have 30k rows, basically I just copy all stuff from aiki_users to openclipart_users skipping duplicates and I want avatar to be id of the file where in old table is a string (only about 300 users have svg avatars about 1k users have non '' avatar but I only what svg).

IS there any way to quickly insert that avatar (without it INSERT SELECT execute in few seconds) into openclipart_users, any solustion that will work will be fine (by quickly I mean less then a minute).

EDIT output of explain over SELECT

+----+--------------------+-------------------+--------+---------------+----------+---------+----------------------------------+--------+-------------+
| id | select_type        | table             | type   | possible_keys | key      | key_len | ref                              | rows   | Extra       |
+----+--------------------+-------------------+--------+---------------+----------+---------+----------------------------------+--------+-------------+
|  1 | PRIMARY            | <derived3>        | ALL    | NULL          | NULL     | NULL    | NULL                             | 106689 |             |
|  1 | PRIMARY            | users             | eq_ref | PRIMARY       | PRIMARY  | 4       | minids.userid                    |      1 |             |
|  3 | DERIVED            | aiki_users        | index  | NULL          | username | 302     | NULL                             | 111273 | Using index |
|  2 | DEPENDENT SUBQUERY | openclipart_files | ALL    | NULL          | NULL     | NULL    | NULL                             |  37715 |             |
|  2 | DEPENDENT SUBQUERY | aiki_users        | eq_ref | PRIMARY       | PRIMARY  | 4       | openclipart_staging.users.userid |      1 | Using where |
+----+--------------------+-------------------+--------+---------------+----------+---------+----------------------------------+--------+-------------+

Solution

  • Converting to join only syntax (getting rid of the correlated subquery and joining to a subselect instead):

    SELECT minids.userid, username, password, full_name, country, email,
          clip.id as avatar,
          homepage, usergroup, notify, nsfwfilter
    FROM aiki_users users
    INNER JOIN (SELECT MIN(userid) as userid FROM aiki_users GROUP by username) 
               minids ON minids.userid = users.userid
    LEFT OUTER JOIN openclipart_files clip ON 
               clip.owner = users.userid AND RIGHT(users.avatar, 3) = 'svg' 
               AND clip.filename = users.avatar
    

    Try that out.