Search code examples
mysqlmysqlimport

Importing a massive 70GB table in MySQL, table remains unavailable for a long time after mysqlimport completes


Ok, the obvious answer to the subject line is: duh! It's 70GB, be patient. But hear me out.

I used mysqlimport to quickly load the data file (mysql standard tab delimited format). It took around 4 hours for the mysqlimport command to complete. During this time the disk was running at capacity (~50MB/sec disk activity)

But It's been another 5 hours since it completed and I still can't run a simple query such as

 select * from my_big_table limit 1

When I run that query it just blocks (seemingly) indefinitely. With no queries running I see disk activity (one mysql process running 3MB/sec read & write, and another doing a few MB/sec of read activity), that's been constant since I ran mysqlimport. I also see what looks like one mysql thread running near 100% since the mysqlimport as well.

My guess is that mysql is processing the indexing and such in the background (150M records, InnoDB) and that's what's holding up my ability to query the table. But I don't know how to verify that or see what activity is going on. Or make any estimate at when it might finish.


Superfluous detail:

I'm sure someone will question why a 70GB table is going into mysql. It's a read-only data table used in a web app. It is indexed only on the id and queries will only be on a limited set of IDs (no range queries over this table), just joins on the id column and direct queries on the id column. The table is updated in a large Hadoop job and will be updated nightly with mysqlimport for efficiency.


UPDATE: MySQL ultimately crashed after the 70GB import, I'm not seeing much in the error log. I've changed the table to the myisam engine and am attempting the import again.

Error Log:

Version: '5.5.28-0ubuntu0.12.04.2-log'  socket: '/var/run/mysqld/mysqld.sock'  port: 3306  (Ubuntu)
121209 18:41:07 [Warning] IP address '10.0.4.160' could not be resolved: Name or service not known
121217  1:54:16 [Note] Plugin 'FEDERATED' is disabled.
121217  1:54:16 InnoDB: The InnoDB memory heap is disabled
121217  1:54:16 InnoDB: Mutexes and rw_locks use GCC atomic builtins
121217  1:54:16 InnoDB: Compressed tables use zlib 1.2.3.4
121217  1:54:16 InnoDB: Initializing buffer pool, size = 11.7G
121217  1:54:18 InnoDB: Completed initialization of buffer pool
121217  1:54:18 InnoDB: highest supported file format is Barracuda.
InnoDB: Log scan progressed past the checkpoint lsn 217624804600
121217  1:54:18  InnoDB: Database was not shut down normally!
InnoDB: Starting crash recovery.
InnoDB: Reading tablespace information from the .ibd files...
InnoDB: Restoring possible half-written data pages from the doublewrite
InnoDB: buffer...
InnoDB: Doing recovery: scanned up to log sequence number 217630047232
InnoDB: Doing recovery: scanned up to log sequence number 217635290112
InnoDB: Doing recovery: scanned up to log sequence number 217640532992
InnoDB: Doing recovery: scanned up to log sequence number 217645775872
InnoDB: Doing recovery: scanned up to log sequence number 217651018752
InnoDB: Doing recovery: scanned up to log sequence number 217656261632
InnoDB: Doing recovery: scanned up to log sequence number 217661504512
InnoDB: Doing recovery: scanned up to log sequence number 217666747392
InnoDB: Doing recovery: scanned up to log sequence number 217671990272
InnoDB: Doing recovery: scanned up to log sequence number 217677233152
InnoDB: Doing recovery: scanned up to log sequence number 217682476032
InnoDB: Doing recovery: scanned up to log sequence number 217687718912
InnoDB: Doing recovery: scanned up to log sequence number 217692961792
InnoDB: Doing recovery: scanned up to log sequence number 217698204672
InnoDB: Doing recovery: scanned up to log sequence number 217703447552
InnoDB: Doing recovery: scanned up to log sequence number 217708690432
InnoDB: Doing recovery: scanned up to log sequence number 217713933312
InnoDB: Doing recovery: scanned up to log sequence number 217719176192
InnoDB: Doing recovery: scanned up to log sequence number 217724419072
InnoDB: Doing recovery: scanned up to log sequence number 217729661952
InnoDB: Doing recovery: scanned up to log sequence number 217734904832
InnoDB: Doing recovery: scanned up to log sequence number 217740147712
InnoDB: Doing recovery: scanned up to log sequence number 217745390592
InnoDB: Doing recovery: scanned up to log sequence number 217750633472
InnoDB: Doing recovery: scanned up to log sequence number 217755876352
InnoDB: Doing recovery: scanned up to log sequence number 217761119232
InnoDB: Doing recovery: scanned up to log sequence number 217766362112
InnoDB: Doing recovery: scanned up to log sequence number 217771604992
InnoDB: Doing recovery: scanned up to log sequence number 217776847872
InnoDB: Doing recovery: scanned up to log sequence number 217782090752
InnoDB: Doing recovery: scanned up to log sequence number 217787333632
InnoDB: Doing recovery: scanned up to log sequence number 217792576512
InnoDB: Doing recovery: scanned up to log sequence number 217797819392
InnoDB: Doing recovery: scanned up to log sequence number 217803062272
InnoDB: Doing recovery: scanned up to log sequence number 217808305152
InnoDB: Doing recovery: scanned up to log sequence number 217813548032
InnoDB: Doing recovery: scanned up to log sequence number 217818790912
InnoDB: Doing recovery: scanned up to log sequence number 217824033792
InnoDB: Doing recovery: scanned up to log sequence number 217829276672
InnoDB: Doing recovery: scanned up to log sequence number 217834519552
InnoDB: Doing recovery: scanned up to log sequence number 217839762432
InnoDB: Doing recovery: scanned up to log sequence number 217845005312
InnoDB: Doing recovery: scanned up to log sequence number 217850248192
InnoDB: Doing recovery: scanned up to log sequence number 217855491072
InnoDB: Doing recovery: scanned up to log sequence number 217860733952
InnoDB: Doing recovery: scanned up to log sequence number 217865976832
InnoDB: Doing recovery: scanned up to log sequence number 217871219712
InnoDB: Doing recovery: scanned up to log sequence number 217876462592
InnoDB: Doing recovery: scanned up to log sequence number 217881705472
InnoDB: Doing recovery: scanned up to log sequence number 217886948352
InnoDB: Doing recovery: scanned up to log sequence number 217892191232
InnoDB: Doing recovery: scanned up to log sequence number 217897434112
InnoDB: Doing recovery: scanned up to log sequence number 217902676992
InnoDB: Doing recovery: scanned up to log sequence number 217907919872
InnoDB: Doing recovery: scanned up to log sequence number 217913162752
InnoDB: Doing recovery: scanned up to log sequence number 217918405632
InnoDB: Doing recovery: scanned up to log sequence number 217923648512
InnoDB: Doing recovery: scanned up to log sequence number 217928891392
InnoDB: Doing recovery: scanned up to log sequence number 217934134272
InnoDB: Doing recovery: scanned up to log sequence number 217939377152
InnoDB: Doing recovery: scanned up to log sequence number 217944620032
InnoDB: Doing recovery: scanned up to log sequence number 217949862912
InnoDB: Doing recovery: scanned up to log sequence number 217955105792
InnoDB: Doing recovery: scanned up to log sequence number 217960348672
InnoDB: Doing recovery: scanned up to log sequence number 217965591552
InnoDB: Doing recovery: scanned up to log sequence number 217970834432
InnoDB: Doing recovery: scanned up to log sequence number 217976077312
InnoDB: Doing recovery: scanned up to log sequence number 217981320192
InnoDB: Doing recovery: scanned up to log sequence number 217986563072
InnoDB: Doing recovery: scanned up to log sequence number 217991805952
InnoDB: Doing recovery: scanned up to log sequence number 217997048832
InnoDB: Doing recovery: scanned up to log sequence number 218002291712
InnoDB: Doing recovery: scanned up to log sequence number 218007534592
InnoDB: Doing recovery: scanned up to log sequence number 218012777472
InnoDB: Doing recovery: scanned up to log sequence number 218018020352
InnoDB: Doing recovery: scanned up to log sequence number 218023263232
InnoDB: Doing recovery: scanned up to log sequence number 218028506112
InnoDB: Doing recovery: scanned up to log sequence number 218033748992
InnoDB: Doing recovery: scanned up to log sequence number 218038991872
InnoDB: Doing recovery: scanned up to log sequence number 218044234752
InnoDB: Doing recovery: scanned up to log sequence number 218047052128
InnoDB: 1 transaction(s) which must be rolled back or cleaned up
InnoDB: in total 17071570 row operations to undo
InnoDB: Trx id counter is 3D0D200
121217  1:54:32  InnoDB: Starting an apply batch of log records to the database...
InnoDB: Progress in percents: 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99
InnoDB: Apply batch completed
InnoDB: Last MySQL binlog file position 0 1211503, file name ./frugg-bin-backups.000389
InnoDB: Starting in background the rollback of uncommitted transactions
121217  2:01:34  InnoDB: Rolling back trx with id 3D02544, 17071570 rows to undo

InnoDB: Progress in percents: 1121217  2:01:34  InnoDB: Waiting for the background threads to start
121217  2:01:35 InnoDB: 1.1.8 started; log sequence number 218047052128
121217  2:01:35 [Note] Recovering after a crash using frugg-bin-backups
121217  2:01:35 [Note] Starting crash recovery...
121217  2:01:35 [Note] Crash recovery finished.
121217  2:01:35 [Note] Server hostname (bind-address): '10.0.1.227'; port: 3306
121217  2:01:35 [Note]   - '10.0.1.227' resolves to '10.0.1.227';
121217  2:01:35 [Note] Server socket created on IP: '10.0.1.227'.
121217  2:01:35 [Note] Event Scheduler: Loaded 0 events
121217  2:01:35 [Note] /usr/sbin/mysqld: ready for connections.
Version: '5.5.28-0ubuntu0.12.04.2-log'  socket: '/var/run/mysqld/mysqld.sock'  port: 3306  (Ubuntu)
121217  2:01:38 [Warning] IP address '10.0.4.160' could not be resolved: Name or service not known
 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100
InnoDB: Rolling back of trx id 3D02544 completed
121217  9:36:40  InnoDB: Rollback of non-prepared transactions completed
121217 16:12:45 [Note] /usr/sbin/mysqld: Normal shutdown

121217 16:12:45 [Note] Event Scheduler: Purging the queue. 0 events
121217 16:12:45  InnoDB: Starting shutdown...
121217 16:12:54  InnoDB: Shutdown completed; log sequence number 230916413935
121217 16:12:54 [Note] /usr/sbin/mysqld: Shutdown complete

Solution

  • After switching to MyISAM I was able to load the table more quickly. It loaded the table, and then took another many hours after the load before I could run queries. This was primarily due to having binary logging enabled, it was writing the 70GB file twice, once to the table, and again to the binary logs.

    I also enabled:

    [mysqld]
    innodb_file_per_table
    

    That makes it a lot easier to tell what's going on with innodb.