
Clock clicks and unixepoch('now','subsec')


I'm trying to log the time it takes for certain procedures to run, but not really what might be considered performance testing in the sense of comparing different code methods that generate the same output to determine which is most efficient.

I just need to add some intervals to a log that is built as the user interacts with the application. Right now, there are two main timings I'd like to capture. One is that I'm logging whether or not a search was needed to fulfill the request and, if so, how many steps were required to locate the item(s); I'd like to know how long it took to perform the search. Then, once the search is done, I'd like to know how long it took to perform the operations on that information to fulfill the request.

It appears that clock clicks in Tcl might be the right value to mark the starting point and points along the way. Much of this work involves writing data to an in-memory SQLite database, so unixepoch('now','subsec') could also be used.
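
For illustration, here is roughly what I have in mind with each approach (this assumes the sqlite3 Tcl package and an in-memory database handle named db; I believe the 'subsec' modifier needs SQLite 3.42 or later):

    package require sqlite3
    sqlite3 db :memory:

    # Option 1: Tcl's own clock
    set t0 [clock clicks]
    # ... perform the search ...
    set t1 [clock clicks]
    puts "search took [expr {$t1 - $t0}] clicks (unit unspecified)"

    # Option 2: ask SQLite for a sub-second Unix timestamp
    set t0 [db onecolumn {select unixepoch('now','subsec')}]
    # ... perform the search ...
    set t1 [db onecolumn {select unixepoch('now','subsec')}]
    puts [format "search took %.3f seconds" [expr {$t1 - $t0}]]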

Is one preferable to the other, or is there a better approach altogether?

If it is relevant, one of my objectives (if possible) is to log the requests in order to run them back again, construct the same log, and compare whether changes to the code make any difference in whether a search was required, the number of steps in a search, the time taken, etcetera.

Thank you for considering my question.


Solution

  • This is what I use for the type of thing you seem to want:

    proc log_duration {msg script} {
        set cmds        [info cmdcount]
        set before      [clock microseconds]
        set code        [catch {uplevel 1 $script} res options]
        set elapsed     [expr {([clock microseconds] - $before) / 1e6}]
        set cmdcount    [expr {[info cmdcount] - $cmds - 12}]   ;# -12: compensate for measurement overhead
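        ;# translate the catch return code into a readable status name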
        set codename    [switch -- $code {
            0       {string cat OK}
            1       {string cat ERROR}
            2       {string cat RETURN}
            3       {string cat BREAK}
            4       {string cat CONTINUE}
            default {set code}
        }]
        puts "$msg, $codename: [format %.6f $elapsed] seconds, $cmdcount Tcl commands"
        dict incr options -level 1
        return -options $options $res
    }
    
    log_duration "Lookup the user details" {
        db eval {select id from users where name=:username}
    }
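
    Since you mentioned that the log lives in an in-memory SQLite database: nothing stops the proc from writing the measurement there instead of (or as well as) printing it. A minimal sketch, assuming a hypothetical log table of your own design:

    # create the (hypothetical) log table once, e.g. at startup
    db eval {create table if not exists log
        (at real, msg text, status text, seconds real, cmds integer)}

    # inside log_duration, alongside (or instead of) the puts:
    set at [expr {$before / 1e6}]   ;# absolute start time, in epoch seconds
    db eval {insert into log (at, msg, status, seconds, cmds)
        values (:at, :msg, :codename, :elapsed, :cmdcount)}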
    

    I use clock microseconds rather than clock clicks for a few reasons:

    • The resolution of clock clicks is unknown to the script, so reporting meaningful durations is difficult. It's also not relative to any specified point in time, so it can't be formatted in a human-meaningful way (a small sketch after this list makes this concrete).
    • On the fastest hardware relevant to me, the smallest time a single bytecoded Tcl command can take is on the order of 0.1 microseconds, and reliable measurement anywhere near that limit is dominated by processor frequency scaling (boosting and throttling to meet thermal budgets), SMT and asymmetric-core scheduling, cache state, context switches, and a host of other effects that make meaningful measurement of small intervals very challenging.
    • In situations where multiple hosts are involved, there is at least a hope that the absolute values returned by clock microseconds are comparable (at least to the degree that the platform's time synchronisation can achieve).
    • The idea of using a CPU cycle count for measurement (or whatever else clock clicks happens to have available) made more sense in the days before frequency scaling, sleep states, multiple cores and core migration, out-of-order execution, deep pipelines, etc.
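
    To make the first point concrete (the values involved are illustrative):

    set c [clock clicks]          ;# opaque counter: unit and epoch unspecified
    set u [clock microseconds]    ;# microseconds since the Unix epoch

    # $c is only meaningful relative to another clock clicks value, whereas
    # $u can also be rendered as a human-readable timestamp:
    puts [clock format [expr {$u / 1000000}] -format %Y-%m-%dT%H:%M:%S]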

    For measurement of sub-sections of the measured script, I usually just use nested log_duration calls. In situations where I'm moving probes around inside the measured section, or where the overhead of log_duration itself becomes a problem, I usually do something like:

    proc ts label {
        global _timestamps
        lappend _timestamps [clock microseconds] $label
    }
    
    set ::_timestamps   {}
    set start           [clock microseconds]
    
    # ... some commands ...
    ts foo
    # ... some others ...
    ts bar
    # ... still more ...
    ts baz
    # ... you get the point ...
    
    set last    $start
    puts [join [lmap {ts label} $::_timestamps {
        try {
            format {%6d, %6d %s} [expr {$ts-$start}] [expr {$ts-$last}] $label
        } finally {
            set last   $ts
        }
    }] \n]
    

    The slightly awkward construction of a single puts over the lmap is because I usually use this approach when one of the things I'm optimising for is minimising log calls: typically this would be creating a single entry in, say, AWS CloudWatch containing the whole timing report.

    Meeting your requirement for replaying and comparing the measurement results will depend on recording the input data relevant to each run, which is outside the scope of the measurement code itself, beyond making the framework's output suitable for that comparison. clock microseconds helps by measuring a consistent thing, and these examples output relative quantities that should be directly comparable between runs.
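
    As a sketch of what that recording side might look like (the requests table, handle_request, and the request representation are all hypothetical placeholders for however your application models a request):

    db eval {create table if not exists requests (request text)}

    # capture each request as it arrives...
    proc record_request {request} {
        db eval {insert into requests (request) values (:request)}
    }

    # ...then replay the captured set against the modified code,
    # producing a fresh log to compare against the original
    proc replay_requests {} {
        db eval {select request from requests order by rowid} {
            log_duration "replay: $request" {
                handle_request $request
            }
        }
    }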

    If these simple approaches don't quite cut it and you need things like centralised visualisation of timestamped events in a network-distributed setting, then you might find some useful bits in the evlog package I wrote, uh, some time ago. It's probably not that difficult to get running again (Tcl is fantastic in that regard) and about 13 years ago I converted it from Itcl / Tk to TclOO, but I know it uses blt's table geometry manager which is long since defunct. I split out that component of the blt package and published it somewhere as a stand-alone package with some fixes for newer Tk, but I can't recall where (probably Sourceforge). It's likely easier just converting the UI to use grid than hunting it down. But it may also want you to pull in the whole m2 framework, which is probably more work than it's worth, but it might be worth raiding the source for ideas.