python, regex, parsing, dbus

Parsing dbus monitor output messages


I am trying to parse dbus-monitor output. Most messages span multiple lines: a header line followed by one or more parameter lines. I need to parse each message and join it into a single-line entry.

The dbus-monitor output looks like this:

method call time=462.117843 sender=:1.62 -> destination=org.freedesktop.filehandler serial=122 path=/org/freedesktop/filehandler/routing; interface=org.freedesktop.filehandler.routing; member=start
int16 29877
uint16 0
method return time=462.117844 sender=org.freedesktop.filehandler -> destination=:1.62 serial=2210 reply_serial=122
int16 29877
uint16 0
method call time=462.117845 sender=:1.62 -> destination=org.freedesktop.filehandler serial=123 path=/org/freedesktop/filehandler/routing; interface=org.freedesktop.filehandler.routing; member=comment
string "starting .."
string "routing"
method return time=462.117846 sender=:1.19 -> destination=:1.62 serial=2212 reply_serial=123
int12 -23145
signal time=463.11223 sender=:1.64 -> destination=(null destination) serial=124 path=/org/freedesktop/fileserver; interface=org.freedesktop.DBus.Properties; member=PropertiesChanged
  string "com.freedesktop.Systemserver"
  array[
    dict entry(
      string "SystemTime"
      variant       struct{
            byte 12
            byte 9
            byte 0
        }
    )
  ]
  array [
  ]

This is the regex I tried for grouping the fields of each message header (the parameters are not captured):

\b(signal|method call|method return)\b time=([\d,.]*) sender=([\w,.,:,(,), ]*) -> destination=([\w,.,:,(,), ]*) serial=([(,),\w]*) (?:path=([\w,\/]*); interface=([\w,.]*); member=([\w,_,-]*))?(?:reply_serial=([\d]*))?
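
For reference, a minimal sketch of what the pattern captures when run against two of the header lines above. The regex is used exactly as written; the names `PATTERN`, `call` and `ret` are only illustrative. Because the path/interface/member group and the reply_serial group are both optional, a method call leaves the last group empty and a method return leaves groups 6 to 8 empty.

```python
import re

# The pattern from the question, unchanged, split over several lines for readability.
PATTERN = re.compile(
    r'\b(signal|method call|method return)\b time=([\d,.]*) '
    r'sender=([\w,.,:,(,), ]*) -> destination=([\w,.,:,(,), ]*) '
    r'serial=([(,),\w]*) '
    r'(?:path=([\w,\/]*); interface=([\w,.]*); member=([\w,_,-]*))?'
    r'(?:reply_serial=([\d]*))?'
)

call = PATTERN.match(
    'method call time=462.117843 sender=:1.62 -> '
    'destination=org.freedesktop.filehandler serial=122 '
    'path=/org/freedesktop/filehandler/routing; '
    'interface=org.freedesktop.filehandler.routing; member=start'
).groups()

ret = PATTERN.match(
    'method return time=462.117844 sender=org.freedesktop.filehandler -> '
    'destination=:1.62 serial=2210 reply_serial=122'
).groups()

print(call)  # nine groups; the last one (reply_serial) is None
print(ret)   # nine groups; path, interface and member are None
```

The method return header carries no interface or member, so those have to be looked up from the matching method call through reply_serial.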

I expect the output in the following format:

C [sender,serial] path interface+member (parameter1, parameter2, ...)
R [destination,reply_serial] interface+member (parameter1, parameter2, ...)
S [sender, serial] path interface+member (parameter1, parameter2, ...)

Sample output for the dbus-monitor messages above:

C [:1.62,122] /org/freedesktop/filehandler/routing org.freedesktop.filehandler.routing.start (29877,0)
R [:1.62,122] org.freedesktop.filehandler.routing.start (29877,0)
C [:1.62,123] /org/freedesktop/filehandler/routing org.freedesktop.filehandler.routing.comment ("starting ..", "routing")
R [:1.62,123] org.freedesktop.filehandler.routing.comment (-23145)
S [:1.64, 124] /org/freedesktop/fileserver org.freedesktop.DBus.Properties.PropertiesChanged ("com.freedesktop.Systemserver"[("SystemTime",{12,9,0})][])

How can this result be achieved when most entries span multiple lines? Signals in particular nest their parameters several levels deep (arrays, dict entries, structs), which makes the parameters hard to extract. Can someone help with parsing these dbus messages into the expected format?


Solution

  • Can you suggest how the code can be rewritten to process line by line?

    Here is the code rearranged to process the input line by line:

    # Python 3
    import re

    regex = r'\b(signal|method call|method return)\b time=([\d,.]*) sender=([\w,.,:,(,), ]*) -> destination=([\w,.,:,(,), ]*) serial=([(,),\w]*) (?:path=([\w,\/]*); interface=([\w,.]*); member=([\w,_,-]*))?(?:reply_serial=([\d]*))?'
    remember = dict()                   # serial -> [interface, member] of pending calls
    sep = None                          # None until the first header line is seen
    with open('dbusl.in') as log:
        for line in log:
            m = re.match(regex, line)
            if m:
                if sep is not None:     # end the previous parameter group
                    print("()" if sep == "(" else ")")  # "()" if it had no parameters
                m = list(m.groups())        # each match is 9 capturing groups
                if m[0] == 'method call':
                    print("C [{2},{4}] {5} {6}.{7}".format(*m), end=' ')
                    remember[m[4]] = m[6:8]     # store interface+member for return
                if m[0] == 'method return':
                    m[6:8] = remember.pop(m[8]) # recall stored interface+member
                    print("R [{3},{8}] {6}.{7}".format(*m), end=' ')
                if m[0] == 'signal':
                    print("S [{2}, {4}] {5} {6}.{7}".format(*m), end=' ')
                sep = "("
            else:
                p = line.rstrip()               # now handle parameters
                if not p or sep is None:        # skip blank lines and lines before
                    continue                    #   the first header
                if p[-1] in "[](){}":           # with "encapsulations":
                    p = p[-1]                   #   delete spaces, "array", "dict ..."
                p = re.sub(r'^\s*\w*\s*', '', p) # delete spaces and data type
                if p[-1] in "])}":
                    sep = ''                    # no separator before closing
                print(sep + p, end='')
                if p[-1] in "[](){}":   sep = ''
                else:                   sep = ', '  # separator after data item
    if sep is not None:                 # end the last parameter group
        print("()" if sep == "(" else ")")
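
The parameter branch of the loop above can be tried on its own. The sketch below wraps the same two steps (keep only a trailing bracket, then strip indentation and the type name) in a helper; the name `clean_param` and the sample lines are assumptions for illustration, not part of the answer's code.

```python
import re

def clean_param(line):
    """Reduce one dbus-monitor parameter line to the token that gets printed."""
    p = line.rstrip()
    if p[-1] in "[](){}":                 # "array[", "dict entry(", "struct{", ")" ...
        p = p[-1]                         # keep only the bracket itself
    return re.sub(r'^\s*\w*\s*', '', p)   # drop indentation and the type name

samples = [
    'int16 29877',
    '  string "SystemTime"',
    '  array[',
    '      variant       struct{',
    '    )',
]
tokens = [clean_param(s) for s in samples]
print(tokens)
```

One limitation worth checking against real logs: the substitution removes only the first word, so a parameter whose type name has two words (dbus-monitor prints object paths as `object path "/..."`) would keep the second word.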
    

    Note that I also changed m[6:8] = remember[m[8]] to m[6:8] = remember.pop(m[8]), so that the interface+member data of a call is removed from the dictionary, and its memory freed, as soon as the matching return has been printed.
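
The difference between indexing and `pop` can be seen in isolation. This is a small sketch with made-up dictionary contents; the default passed to the second `pop` is a hypothetical guard that is not in the code above, shown only as one way to avoid a KeyError when a return arrives whose call was never logged.

```python
remember = {}

# Stored when the method call with serial 122 is seen.
remember['122'] = ['org.freedesktop.filehandler.routing', 'start']

# remember['122'] would leave the entry in place; pop() returns it and removes it.
interface, member = remember.pop('122')
remaining = len(remember)   # nothing left once the return has been handled

# Hypothetical guard: a default value instead of KeyError for an unknown reply_serial.
unknown = remember.pop('999', ['?', '?'])

print(interface, member, remaining, unknown)
```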