LLDB Breakpoints performance - what should I expect?

I've written a script that adds a lot of breakpoints to my iOS project. Each breakpoint has a command that calls some logging code and continues without stopping.

While the app runs, those breakpoints are hit dozens, if not hundreds, of times per second. Unfortunately, app performance collapsed after I added them: the app is pretty much unresponsive, because handling the breakpoints slows everything down.

My question is: is that normal? Is the performance cost of breakpoints really that significant?

Below is part of my Python script from ~/.lldb:

...
for funcName in funcNames:
   breakpointCommand = f'breakpoint set -n {funcName} -f {fileName}'
   lldb.debugger.HandleCommand(breakpointCommand)
   lldb.debugger.HandleCommand('breakpoint command add --script-type python --python-function devTrackerScripts.breakpoint_callback')

def breakpoint_callback(frame, bp_loc, dict):
   lineEntry = frame.GetLineEntry()
   functionName = frame.GetDisplayFunctionName()
   expression = f'expr -- proofLog(lineEntry: "{lineEntry}", function: "{functionName}")'

   lldb.debugger.HandleCommand(expression)

   return False

Solution

With default settings, lldb appears to be slower than gdb at handling a breakpoint callback when debugging locally on Linux (measured with lldb 10.0.0 vs gdb 9.1 on Ubuntu 20.04).

testcase.c

Here is the test case for the performance measurement.

#include <stdio.h>
int count = 0;
int fib(int n) {
    ++count;
    if(n < 2) return n;
    return fib(n - 1) + fib(n - 2);
}
int main() {
    printf("fib(16) = %d\n", fib(16));
    printf("count-1 = %d\n", count - 1);
}

It can be compiled with the command below to get the executable testcase.

clang -g testcase.c -o testcase

m_lldb.py

Create a Python script m_lldb.py with the code below. It sets a breakpoint on fib() and attaches the breakpoint callback mbp_callback to it.

import lldb
import time
s, e = 0, 0
def mbp_callback(frame, bp_loc, dict):
    global s, e
    l, e = e, time.time()
    if s == 0: l, s = e, e
    print("Callback: %9.6fs,  Total: %9.6fs" % (e - l, e - s))
    return False
def __lldb_init_module(debugger, dict):
    tgt = debugger.GetSelectedTarget()
    bp = tgt.BreakpointCreateByName("fib")
    bp.SetScriptCallbackFunction('m_lldb.mbp_callback')

run-lldb.sh

Once the executable testcase and the script m_lldb.py are ready, run the bash script below to measure how lldb performs when handling the breakpoint callback.

#!/bin/bash -x
cat << LLDBCMD > lldb.cmd
command script import m_lldb.py
breakpoint list
run
quit
LLDBCMD
lldb -s lldb.cmd -- ./testcase

m_gdb.py

Now let's create a Python script for gdb, which has the same breakpoint callback as the one for lldb.

import gdb
import time
s, e = 0, 0
class MBreakpoint(gdb.Breakpoint):
    def stop(self):
        global s, e
        l, e = e, time.time()
        if s == 0: l, s = e, e
        print("Callback: %9.6fs,  Total: %9.6fs" % (e - l, e - s))
        return False
MBreakpoint("fib")

run-gdb.sh

Run the bash script below to measure how gdb performs when handling the breakpoint callback.

#!/bin/bash -x
cat << GDBCMD > gdb.cmd
set pagination off
source m_gdb.py
info breakpoint
run
quit
GDBCMD
gdb -x gdb.cmd --args ./testcase

Performance measurement results for handling breakpoint callback

Server information

$ lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description:    Ubuntu 20.04 LTS
Release:        20.04
Codename:       focal

$ lldb --version
lldb version 10.0.0

$ gdb --version
GNU gdb (Ubuntu 9.1-0ubuntu1) 9.1

$ grep -m1 "model name" /proc/cpuinfo
model name      : Intel(R) Xeon(R) CPU E5-2620 v4 @ 2.10GHz

Results with lldb

... ...
Callback:  0.001098s,  Total:  3.509685s
Callback:  0.001106s,  Total:  3.510791s
Callback:  0.001099s,  Total:  3.511890s
Callback:  0.001097s,  Total:  3.512987s
Callback:  0.001097s,  Total:  3.514084s
Callback:  0.001107s,  Total:  3.515191s
fib(16) = 987
count-1 = 3192
Process 2527525 exited with status = 0 (0x00000000)

Results with gdb

... ...
Callback:  0.000182s,  Total:  0.594779s
Callback:  0.000188s,  Total:  0.594966s
Callback:  0.000182s,  Total:  0.595149s
Callback:  0.000189s,  Total:  0.595337s
Callback:  0.000184s,  Total:  0.595521s
Callback:  0.000187s,  Total:  0.595709s
fib(16) = 987
count-1 = 3192
[Inferior 1 (process 2527714) exited normally]

These results show that lldb is about 5.9x slower than gdb at invoking a breakpoint callback each time the associated breakpoint is hit.
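The ratio can be checked from the last "Total" value in each log. The snippet below is only a small sketch of that arithmetic: the variable names are mine, the totals are copied from the output above, and I am assuming the 3192 printed as count-1 is the number of intervals between breakpoint hits.

```python
# Final "Total" values copied from the lldb and gdb logs above.
lldb_total_s = 3.515191
gdb_total_s = 0.595709

# Assumption: "count-1 = 3192" is the number of intervals between hits,
# since the timer starts at the first callback.
intervals = 3192

slowdown = lldb_total_s / gdb_total_s
lldb_per_hit_us = lldb_total_s / intervals * 1e6
gdb_per_hit_us = gdb_total_s / intervals * 1e6

print("lldb/gdb slowdown: %.1fx" % slowdown)
print("lldb per hit: %.0f us" % lldb_per_hit_us)
print("gdb per hit:  %.0f us" % gdb_per_hit_us)
```

The per-hit averages should line up with the individual "Callback:" intervals in the logs (about 1.1 ms for lldb and under 0.2 ms for gdb).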

LLDB's client-server architecture appears to be the main cause of this poor performance when debugging a local process. As described in the documentation at https://lldb.llvm.org/use/remote.html, LLDB on Linux and macOS uses the remote debugging stub even when debugging a process locally. Heavy threading inside lldb seems to be another contributing factor.

Using gdbserver on localhost makes gdb run around 2x slower, but it is still much faster than debugging locally with lldb.

run-gdbsvr.sh

#!/bin/bash -x
(gdbserver --once localhost:2345 ./testcase) &
cat << GDBCMD > gdbsvr.cmd
target remote localhost:2345
set pagination off
source m_gdb.py
info breakpoint
continue
quit
GDBCMD
gdb -x ./gdbsvr.cmd --args ./testcase

Results with gdbserver

... ...
Callback:  0.000384s,  Total:  1.236005s
Callback:  0.000387s,  Total:  1.236392s
Callback:  0.000383s,  Total:  1.236775s
Callback:  0.000386s,  Total:  1.237162s
Callback:  0.000383s,  Total:  1.237545s
Callback:  0.000384s,  Total:  1.237929s
fib(16) = 987
count-1 = 3192

Child exited with status 0
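To put the three setups side by side, here is a short sketch that compares the final "Total" values from the logs. The dictionary and variable names are mine, not part of any tool; the numbers are copied from the outputs above.

```python
# Final "Total" values (seconds) copied from the three result logs.
totals_s = {
    "gdb (local)": 0.595709,
    "gdb + gdbserver": 1.237929,
    "lldb (local)": 3.515191,
}

baseline = totals_s["gdb (local)"]
# Express every setup as a multiple of the local gdb run.
relative = {name: total / baseline for name, total in totals_s.items()}

for name, factor in relative.items():
    print("%-16s %.2fx of local gdb" % (name, factor))
```

Even with the gdbserver round trips added, gdb stays well below the local lldb run in this measurement.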

PS: The cost of the breakpoint callback itself should also be considered. Let's measure how long the callback body takes to execute under lldb and gdb.

callback.py

import time
s, e = 0, 0
def stop():
    t = time.time()
    global s, e
    l, e = e, time.time()
    if s == 0: l, s = e, e
    print("Callback: %9.6fs,  Total: %9.6fs" % (e - l, e - s))
    return time.time() - t
for i in range(6):
    print("%9.6f" % stop())

Run callback.py with lldb

(lldb) command script import callback.py
Callback:  0.000000s,  Total:  0.000000s
 0.000034
Callback:  0.000045s,  Total:  0.000045s
 0.000009
Callback:  0.000015s,  Total:  0.000060s
 0.000007
Callback:  0.000013s,  Total:  0.000073s
 0.000007
Callback:  0.000013s,  Total:  0.000086s
 0.000007
Callback:  0.000013s,  Total:  0.000099s
 0.000007
(lldb)

Run callback.py with gdb

(gdb) source callback.py
Callback:  0.000000s,  Total:  0.000000s
 0.000023
Callback:  0.000033s,  Total:  0.000033s
 0.000010
Callback:  0.000018s,  Total:  0.000051s
 0.000010
Callback:  0.000017s,  Total:  0.000068s
 0.000009
Callback:  0.000017s,  Total:  0.000086s
 0.000009
Callback:  0.000018s,  Total:  0.000103s
 0.000009
(gdb)

Each execution of the breakpoint callback body takes about 7 to 9 µs, and it runs slightly faster in lldb. The callback body therefore accounts for very little of the per-hit cost: ~4.8% for gdb and ~0.6% for lldb.
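Those percentages can be reproduced with the arithmetic below. This is a sketch under my own assumptions about which numbers go in: I take the steady-state body time from the callback.py runs (9 µs for gdb, 7 µs for lldb) and divide by a typical "Callback:" interval from the earlier logs (187 µs for gdb, 1100 µs for lldb).

```python
# Steady-state time spent inside the callback body (microseconds),
# taken from the callback.py runs above.
body_us = {"gdb": 9.0, "lldb": 7.0}

# Typical interval between consecutive breakpoint callbacks (microseconds),
# taken from the "Callback:" column of the earlier gdb and lldb logs.
per_hit_us = {"gdb": 187.0, "lldb": 1100.0}

# Share of each breakpoint hit that is spent in the Python callback body.
share_pct = {name: 100.0 * body_us[name] / per_hit_us[name] for name in body_us}

for name, pct in share_pct.items():
    print("%-4s callback body is %.1f%% of the per-hit cost" % (name, pct))
```

In other words, nearly all of the time per hit is spent in the debugger's own stop-and-resume machinery rather than in the Python code.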