Tags: performance, apache-spark, apache-kudu, data-ingestion

Apache Kudu slow insert, high queuing time


I have been using the Spark Data Source to write Parquet data to Kudu, and the write performance is terrible: about 12,000 rows/second, with each row roughly 160 bytes.
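For reference, the write path looks roughly like the sketch below. It assumes the kudu-spark connector; the master addresses, table name, and input path are placeholders, not our real values.

```scala
// Minimal sketch of the Parquet-to-Kudu write, assuming the kudu-spark
// connector. Master addresses, table name, and input path are placeholders.
import org.apache.spark.sql.SparkSession

object ParquetToKudu {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().appName("parquet-to-kudu").getOrCreate()

    // Read the source Parquet files.
    val df = spark.read.parquet("hdfs:///data/source_parquet")

    // Write through the Kudu Data Source; each DataFrame row becomes an
    // insert/upsert keyed on the Kudu table's primary key.
    df.write
      .format("org.apache.kudu.spark.kudu")
      .option("kudu.master", "kudu-master-1:7051,kudu-master-2:7051,kudu-master-3:7051")
      .option("kudu.table", "impala::db.target_table")
      .mode("append")
      .save()

    spark.stop()
  }
}
```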

We have 7 Kudu nodes, each with 24 cores, 64 GB of RAM, and 12 SATA disks. None of the resources seems to be the bottleneck: tserver CPU usage is around 3-4 cores, memory usage around 10 GB, and there is no disk congestion.

Still, I see that most of the time write requests are stuck queuing, as in the tserver trace below. Any ideas are appreciated.

W0811 12:34:03.526340  7753 rpcz_store.cc:251] Call kudu.tserver.TabletServerService.Write from 10.60.170.18:10000 (ReqId={client: 81ae6f3c6e1b4d9493ea95f87ccd1dfa, seq_no=9365, attempt_no=1}) took 13255ms (client timeout 10000).
W0811 12:34:03.526489  7753 rpcz_store.cc:255] Trace:
0811 12:33:50.270477 (+     0us) service_pool.cc:163] Inserting onto call queue
0811 12:33:50.270497 (+    20us) service_pool.cc:222] Handling call
0811 12:34:03.526316 (+13255819us) inbound_call.cc:157] Queueing success response
Related trace 'txn':
0811 12:34:03.328337 (+     0us) write_transaction.cc:101] PREPARE: Starting
0811 12:34:03.328563 (+   226us) write_transaction.cc:268] Acquiring schema lock in shared mode
0811 12:34:03.328564 (+     1us) write_transaction.cc:271] Acquired schema lock
0811 12:34:03.328564 (+     0us) tablet.cc:400] PREPARE: Decoding operations
0811 12:34:03.328742 (+   178us) tablet.cc:422] PREPARE: Acquiring locks for 24 operations
0811 12:34:03.447163 (+118421us) lock_manager.cc:377] Waited 118408us for lock on <redacted>
0811 12:34:03.447203 (+    40us) tablet.cc:426] PREPARE: locks acquired
0811 12:34:03.447203 (+     0us) write_transaction.cc:126] PREPARE: finished.
0811 12:34:03.447361 (+   158us) write_transaction.cc:136] Start()
0811 12:34:03.447366 (+     5us) write_transaction.cc:141] Timestamp: P: 1533965643563964 usec, L: 6
0811 12:34:03.447674 (+   308us) log.cc:582] Serialized 64909 byte log entry
0811 12:34:03.449561 (+  1887us) write_transaction.cc:149] APPLY: Starting
0811 12:34:03.526238 (+ 76677us) tablet_metrics.cc:365] ProbeStats: bloom_lookups=48,key_file_lookups=48,delta_file_lookups=24,mrs_lookups=0
0811 12:34:03.526260 (+    22us) log.cc:582] Serialized 237 byte log entry
0811 12:34:03.526268 (+     8us) write_transaction.cc:309] Releasing row and schema locks
0811 12:34:03.526280 (+    12us) write_transaction.cc:277] Released schema lock
0811 12:34:03.526300 (+    20us) write_transaction.cc:196] FINISH: updating metrics
Metrics: {"child_traces":[["txn",{"apply.queue_time_us":11,"cfile_cache_hit":205,"cfile_cache_hit_bytes":21900627,"num_ops":24,"prepare.queue_time_us":13057291,"prepare.run_cpu_time_us":1017,"prepare.run_wall_time_us":119378,"raft.queue_time_us":71,"raft.run_cpu_time_us":303,"raft.run_wall_time_us":304,"replication_time_us":2170,"row_lock_wait_count":1,"row_lock_wait_us":118408,"spinlock_wait_cycles":45824}]]}

Solution

  • Turns out it was due to duplication in our data. We were using as the Kudu primary key a field in which about 1.2 million rows share the same value (an empty string). So Kudu updated that same key 1.2 million times, and each update had to acquire the row lock, hence the drop in ingest speed over time.

    We removed the rows carrying the duplicated key, and the ingest speed increased roughly 10x (see the sketch below for how the skew was found and filtered out).
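A minimal sketch of the check we ran, assuming the DataFrame loaded from Parquet and a hypothetical primary-key column named event_key:

```scala
// Sketch only: "event_key" and the input path are hypothetical stand-ins
// for our actual schema.
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{col, count}

object FindKeySkew {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().appName("find-key-skew").getOrCreate()
    val df = spark.read.parquet("hdfs:///data/source_parquet")

    // Count rows per primary-key value; one value (the empty string)
    // accounted for ~1.2 million rows in our data.
    df.groupBy("event_key")
      .agg(count("*").as("cnt"))
      .orderBy(col("cnt").desc)
      .show(10, truncate = false)

    // Drop the rows carrying the degenerate key before writing to Kudu.
    val cleaned = df.filter(col("event_key") =!= "")
    println(s"rows kept: ${cleaned.count()}")

    spark.stop()
  }
}
```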