Tags: debugging, clang, llvm, lldb

LLDB Breakpoints performance - what should I expect?


I wrote a script that adds a lot of breakpoints to my iOS project. Each breakpoint has a command that calls some logging code and continues without stopping.

While my project runs, those breakpoints are hit dozens, if not hundreds, of times a second. Unfortunately, app performance collapsed after I added them. The app is pretty much unresponsive because executing the breakpoints slows everything down.

My question here is: is that normal? Is the performance cost of breakpoints really that significant?

Below is part of my Python script from ~/.lldb:

...
for funcName in funcNames:
    breakpointCommand = f'breakpoint set -n {funcName} -f {fileName}'
    lldb.debugger.HandleCommand(breakpointCommand)
    # With no breakpoint ID given, the command is attached to the most recently created breakpoint.
    lldb.debugger.HandleCommand('breakpoint command add --script-type python --python-function devTrackerScripts.breakpoint_callback')

def breakpoint_callback(frame, bp_loc, dict):
    lineEntry = frame.GetLineEntry()
    functionName = frame.GetDisplayFunctionName()
    # Evaluate the app's logging function in the context of the stopped process.
    expression = f'expr -- proofLog(lineEntry: "{lineEntry}", function: "{functionName}")'

    lldb.debugger.HandleCommand(expression)

    # Returning False tells LLDB not to stop at this breakpoint.
    return False


Solution

  • With default settings, lldb appears to handle breakpoint callbacks more slowly than gdb when debugging locally on Linux (lldb 10.0.0 vs. gdb 9.1 on Ubuntu 20.04).

    testcase.c

    Here is the test case for the performance measurement.

    #include <stdio.h>
    int count = 0;
    int fib(int n) {
        ++count;
        if(n < 2) return n;
        return fib(n - 1) + fib(n - 2);
    }
    int main() {
        printf("fib(16) = %d\n", fib(16));
        printf("count-1 = %d\n", count - 1);
    }
    

    It can be compiled with the command below to get the executable testcase.

    clang -g testcase.c -o testcase
    

    m_lldb.py

    Create a Python script m_lldb.py with the code below for setting a breakpoint at fib() and a breakpoint callback mbp_callback.

    import lldb
    import time
    s, e = 0, 0
    def mbp_callback(frame, bp_loc, dict):
        global s, e
        l, e = e, time.time()
        if s == 0: l, s = e, e
        print("Callback: %9.6fs,  Total: %9.6fs" % (e - l, e - s))
        return False
    def __lldb_init_module(debugger, dict):
        tgt = debugger.GetSelectedTarget()
        bp = tgt.BreakpointCreateByName("fib")
        bp.SetScriptCallbackFunction('m_lldb.mbp_callback')
    

    run-lldb.sh

    Once the executable testcase and the script m_lldb.py are ready, run the bash script below to measure how long lldb takes to handle each breakpoint callback.

    #!/bin/bash -x
    cat << LLDBCMD > lldb.cmd
    command script import m_lldb.py
    breakpoint list
    run
    quit
    LLDBCMD
    lldb -s lldb.cmd -- ./testcase
    

    m_gdb.py

    Now let's create a Python script for gdb, which has the same breakpoint callback as the one for lldb.

    import gdb
    import time
    s, e = 0, 0
    class MBreakpoint(gdb.Breakpoint):
        def stop(self):
            global s, e
            l, e = e, time.time()
            if s == 0: l, s = e, e
            print("Callback: %9.6fs,  Total: %9.6fs" % (e - l, e - s))
            return False
    MBreakpoint("fib")
    

    run-gdb.sh

    Run the bash script below to measure how long gdb takes to handle each breakpoint callback.

    #!/bin/bash -x
    cat << GDBCMD > gdb.cmd
    set pagination off
    source m_gdb.py
    info breakpoint
    run
    quit
    GDBCMD
    gdb -x gdb.cmd --args ./testcase
    

    Performance measurement results for handling breakpoint callback

    Server information

    $ lsb_release -a
    No LSB modules are available.
    Distributor ID: Ubuntu
    Description:    Ubuntu 20.04 LTS
    Release:        20.04
    Codename:       focal
    
    $ lldb --version
    lldb version 10.0.0
    
    $ gdb --version
    GNU gdb (Ubuntu 9.1-0ubuntu1) 9.1
    
    $ grep -m1 "model name" /proc/cpuinfo
    model name      : Intel(R) Xeon(R) CPU E5-2620 v4 @ 2.10GHz
    

    Results with lldb

    ... ...
    Callback:  0.001098s,  Total:  3.509685s
    Callback:  0.001106s,  Total:  3.510791s
    Callback:  0.001099s,  Total:  3.511890s
    Callback:  0.001097s,  Total:  3.512987s
    Callback:  0.001097s,  Total:  3.514084s
    Callback:  0.001107s,  Total:  3.515191s
    fib(16) = 987
    count-1 = 3192
    Process 2527525 exited with status = 0 (0x00000000)
    

    Results with gdb

    ... ...
    Callback:  0.000182s,  Total:  0.594779s
    Callback:  0.000188s,  Total:  0.594966s
    Callback:  0.000182s,  Total:  0.595149s
    Callback:  0.000189s,  Total:  0.595337s
    Callback:  0.000184s,  Total:  0.595521s
    Callback:  0.000187s,  Total:  0.595709s
    fib(16) = 987
    count-1 = 3192
    [Inferior 1 (process 2527714) exited normally]
    

    These results show that lldb takes roughly 5.9 times as long as gdb to invoke a breakpoint callback each time the associated breakpoint is hit (3.52 s vs. 0.60 s in total for the run).
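
The 5.9x figure can be re-derived from the final "Total" values printed above. This is only a minimal sketch: it assumes the totals span the 3192 intervals between the 3193 calls to fib, and the variable names are illustrative.

```python
# Final "Total" values from the lldb and gdb runs above, in seconds.
LLDB_TOTAL_S = 3.515191
GDB_TOTAL_S = 0.595709

# count-1 = 3192: the first callback starts the clock, so the total
# covers 3192 intervals between consecutive breakpoint hits.
INTERVALS = 3192

# Average cost of one breakpoint hit, in microseconds.
lldb_per_hit_us = LLDB_TOTAL_S / INTERVALS * 1e6
gdb_per_hit_us = GDB_TOTAL_S / INTERVALS * 1e6

# How many times longer lldb takes than gdb for the same run.
slowdown = LLDB_TOTAL_S / GDB_TOTAL_S

print(f"lldb: {lldb_per_hit_us:.0f} us per hit")
print(f"gdb:  {gdb_per_hit_us:.0f} us per hit")
print(f"lldb/gdb: {slowdown:.1f}x")
```

The per-hit averages agree with the individual "Callback" intervals in the logs (about 1.1 ms for lldb and about 0.19 ms for gdb).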

    The client-server architecture of LLDB is a likely cause of this overhead when debugging a local process. As described in the documentation at https://lldb.llvm.org/use/remote.html, LLDB on Linux and macOS uses the remote debugging stub even when debugging a process locally. Heavy use of threads inside lldb may be another contributing factor.

    Using gdbserver on localhost makes gdb about 2x slower (1.24 s vs. 0.60 s in total), but that is still much faster than debugging locally with lldb (3.52 s).

    run-gdbsvr.sh

    #!/bin/bash -x
    (gdbserver --once localhost:2345 ./testcase) &
    cat << GDBCMD > gdbsvr.cmd
    target remote localhost:2345
    set pagination off
    source m_gdb.py
    info breakpoint
    continue
    quit
    GDBCMD
    gdb -x ./gdbsvr.cmd --args ./testcase
    

    Results with gdbserver

    ... ...
    Callback:  0.000384s,  Total:  1.236005s
    Callback:  0.000387s,  Total:  1.236392s
    Callback:  0.000383s,  Total:  1.236775s
    Callback:  0.000386s,  Total:  1.237162s
    Callback:  0.000383s,  Total:  1.237545s
    Callback:  0.000384s,  Total:  1.237929s
    fib(16) = 987
    count-1 = 3192
    
    Child exited with status 0
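
To summarize a whole run instead of reading the tail of the output, the "Callback:" lines can be parsed and averaged. A small sketch, assuming the debugger output has been captured as text; the helper name mean_callback_interval is hypothetical, and the sample is the six gdbserver lines shown above.

```python
import re

# Matches lines such as "Callback:  0.000384s,  Total:  1.236005s".
LINE_RE = re.compile(r"Callback:\s*([0-9.]+)s,\s*Total:\s*([0-9.]+)s")

def mean_callback_interval(text):
    """Return the mean 'Callback' interval in seconds, or None if no line matches."""
    intervals = [float(m.group(1)) for m in LINE_RE.finditer(text)]
    if not intervals:
        return None
    return sum(intervals) / len(intervals)

# The last six lines of the gdbserver run from this answer.
sample = """\
Callback:  0.000384s,  Total:  1.236005s
Callback:  0.000387s,  Total:  1.236392s
Callback:  0.000383s,  Total:  1.236775s
Callback:  0.000386s,  Total:  1.237162s
Callback:  0.000383s,  Total:  1.237545s
Callback:  0.000384s,  Total:  1.237929s
"""

mean_s = mean_callback_interval(sample)
print(f"mean interval: {mean_s * 1e6:.1f} us")
```

Feeding the full captured output of each run through the same helper gives a per-debugger average that is less sensitive to a few outlier hits.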
    

    PS: The cost of executing the callback body itself should also be considered. Let's measure it in both lldb and gdb.

    callback.py

    import time
    s, e = 0, 0
    def stop():
        t = time.time()
        global s, e
        l, e = e, time.time()
        if s == 0: l, s = e, e
        print("Callback: %9.6fs,  Total: %9.6fs" % (e - l, e - s))
        return time.time() - t
    for i in range(6):
        print("%9.6f" % stop())
    

    Run callback.py with lldb

    (lldb) command script import callback.py
    Callback:  0.000000s,  Total:  0.000000s
     0.000034
    Callback:  0.000045s,  Total:  0.000045s
     0.000009
    Callback:  0.000015s,  Total:  0.000060s
     0.000007
    Callback:  0.000013s,  Total:  0.000073s
     0.000007
    Callback:  0.000013s,  Total:  0.000086s
     0.000007
    Callback:  0.000013s,  Total:  0.000099s
     0.000007
    (lldb)
    

    Run callback.py with gdb

    (gdb) source callback.py
    Callback:  0.000000s,  Total:  0.000000s
     0.000023
    Callback:  0.000033s,  Total:  0.000033s
     0.000010
    Callback:  0.000018s,  Total:  0.000051s
     0.000010
    Callback:  0.000017s,  Total:  0.000068s
     0.000009
    Callback:  0.000017s,  Total:  0.000086s
     0.000009
    Callback:  0.000018s,  Total:  0.000103s
     0.000009
    (gdb)
    

    Each execution of the breakpoint callback takes about 7 to 9 µs, and it runs slightly faster in lldb. The callback body therefore accounts for only a small share of the per-hit cost: ~4.8% for gdb and ~0.6% for lldb.
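
The percentages above can be checked by dividing the time spent in the callback body by the average cost of one breakpoint hit. A rough sketch, assuming 9 µs (gdb) and 7 µs (lldb) for the callback body and using the run totals from the earlier measurements; the function name callback_share is made up for illustration.

```python
# Approximate time spent inside the callback body, from the callback.py runs.
CALLBACK_BODY_US = {"gdb": 9.0, "lldb": 7.0}

# Final "Total" values from the breakpoint runs, in seconds.
TOTAL_S = {"gdb": 0.595709, "lldb": 3.515191}

# count-1 printed by the test program: intervals between breakpoint hits.
INTERVALS = 3192

def callback_share(debugger):
    """Fraction of the average per-hit cost spent executing the callback body."""
    per_hit_us = TOTAL_S[debugger] / INTERVALS * 1e6
    return CALLBACK_BODY_US[debugger] / per_hit_us

for name in ("gdb", "lldb"):
    print(f"{name}: {callback_share(name):.1%}")
```

In other words, almost all of the per-hit time is spent in the debugger's stop-and-resume machinery, not in the Python callback.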