Tags: python, data-mining, trace, pattern-recognition, repeat

How to find repeated sequences in a trace file


I have been given an assignment in which I need to instrument a given application, generate a trace file, and later generate a sequence diagram from that trace file. The application is written in Python. It was instrumented at the points where each method is entered and exited.

My main goal is to find the repetitive patterns in the trace file.

The following is a sample of the trace file:

Entering    get_instance    None    []  None    10:25:30:743000
Entering    __init__    ConfigHandler   ['config_filepath'] 56663624    10:25:30:743000
Entering    _load_config    ConfigHandler   ['path']    56663624    10:25:30:744000
Exited  _load_config    ConfigHandler   True    56663624    10:25:30:746000
Exited  __init__    ConfigHandler   None    56663624    10:25:30:747000
Exited  get_instance    None    <commons.ConfigHandler.ConfigHandler object at 0x0000000003609E48>  None    10:25:30:747000
Entering    __init__    ColumnConverter []  56963312    10:25:30:769000
Exited  __init__    ColumnConverter None    56963312    10:25:30:769000
Entering    __init__    PredicatesFactory   []  56963424    10:25:30:769000
Exited  __init__    PredicatesFactory   None    56963424    10:25:30:769000
Entering    __init__    LogFileConverter    []  56963536    10:25:30:769000
Exited  __init__    LogFileConverter    None    56963536    10:25:30:769000
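Before looking for repetitions, the trace has to be turned into a sequence of calls. The following is a minimal sketch of that step, not the application's actual tooling. It assumes the real file is tab-separated (the sample above appears to have had its tabs expanded to spaces) and that the six columns are: event kind, method name, class name, arguments or return value, object id, and timestamp. The names `TraceEvent`, `parse_trace`, and `call_sequence` are made up for illustration.

```python
from collections import namedtuple

# Field names are guesses based on the sample, not a documented format.
TraceEvent = namedtuple("TraceEvent", "kind method cls payload obj_id timestamp")


def parse_trace(lines):
    """Parse tab-separated trace lines into TraceEvent records."""
    events = []
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) != 6:
            continue  # skip blank or malformed lines
        events.append(TraceEvent(*fields))
    return events


def call_sequence(events):
    """Return the ordered 'Class.method' names of all method entries."""
    return [f"{e.cls}.{e.method}" for e in events if e.kind == "Entering"]


# A few lines from the sample, rewritten with explicit tabs (assumed separator).
sample = [
    "Entering\t__init__\tColumnConverter\t[]\t56963312\t10:25:30:769000",
    "Exited\t__init__\tColumnConverter\tNone\t56963312\t10:25:30:769000",
    "Entering\t__init__\tPredicatesFactory\t[]\t56963424\t10:25:30:769000",
    "Exited\t__init__\tPredicatesFactory\tNone\t56963424\t10:25:30:769000",
]
calls = call_sequence(parse_trace(sample))
```

Because `parse_trace` accepts any iterable of lines, an open file object can be passed to it directly. Keeping only the "Entering" events gives one symbol per call, which is the form most pattern-mining approaches expect.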

How can I find patterns of repetition in a trace file?



Solution

  • You can use the PrefixSpan algorithm to find frequent sequential patterns.

    The paper:

    http://www.cs.uiuc.edu/~hanj/pdf/span01.pdf

    This site has an open-source Java implementation that you can draw inspiration from:

    http://www.philippe-fournier-viger.com/spmf/index.php?link=documentation.php#examplePrefixSpan
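
To make the suggestion concrete, here is a simplified Python sketch of the PrefixSpan idea, restricted to sequences of single items (one method name per step). It is not the SPMF implementation. PrefixSpan mines a database of several sequences, so the sketch assumes the trace is first split into one call sequence per top-level call. That splitting rule and the names `split_top_level` and `prefixspan` are assumptions made for illustration.

```python
def split_top_level(events):
    """Split (kind, name) events into one call sequence per top-level call.

    Assumes every "Entering" event is matched by a later "Exited" event.
    """
    sequences, current, depth = [], [], 0
    for kind, name in events:
        if kind == "Entering":
            depth += 1
            current.append(name)
        elif kind == "Exited":
            depth -= 1
            if depth == 0:  # the outermost call has returned
                sequences.append(current)
                current = []
    return sequences


def prefixspan(sequences, min_support):
    """Return (pattern, support) pairs for all frequent sequential patterns.

    Support is the number of sequences that contain the pattern as a
    subsequence; gaps between the matched items are allowed.
    """
    results = []

    def mine(prefix, projected):
        # projected holds (sequence index, start position) pairs; each pair
        # stands for the suffix sequences[i][start:].
        first_pos = {}  # item -> {sequence index: first position in suffix}
        for i, start in projected:
            seq = sequences[i]
            for pos in range(start, len(seq)):
                first_pos.setdefault(seq[pos], {}).setdefault(i, pos)
        for item, occurrences in sorted(first_pos.items()):
            if len(occurrences) >= min_support:
                pattern = prefix + [item]
                results.append((pattern, len(occurrences)))
                # Project onto what follows the first occurrence of item.
                mine(pattern, [(i, pos + 1) for i, pos in occurrences.items()])

    mine([], [(i, 0) for i in range(len(sequences))])
    return results


# Made-up database with placeholder method names "a", "b", "c".
db = [["a", "b", "c"], ["a", "c"], ["a", "b", "c"], ["b", "c"]]
patterns = prefixspan(db, min_support=3)
```

With this made-up database and a minimum support of 3, the frequent patterns are `a`, `a c`, `b`, `b c`, and `c`. Note that patterns may skip over intermediate calls. If only strictly contiguous repetitions matter, counting repeated n-grams over the call sequence is a simpler alternative.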