Parsing file that has nested loop structures into list structure using python


I am struggling to parse an FPGA simulation file (.vwf), specifically at the point where the input waveforms are specified using a kind of nested loop system. An example of the file format is:

TRANSITION_LIST("ADDR[0]")
{
    NODE
    {
        REPEAT = 1;
        LEVEL 0 FOR 100.0;
        LEVEL 1 FOR 100.0;
        NODE
        {
            REPEAT = 3;
            LEVEL 0 FOR 100.0;
            LEVEL 1 FOR 100.0;
            NODE
            {
                REPEAT = 2;
                LEVEL 0 FOR 200.0;
                LEVEL 1 FOR 200.0;
            }
        }
        LEVEL 0 FOR 100.0;
    }
}

So this means that the channel named "ADDR[0]" has its logic value switched as follows:

LEVEL 0 FOR 100.0;
LEVEL 1 FOR 100.0;
LEVEL 0 FOR 100.0;
LEVEL 1 FOR 100.0;
LEVEL 0 FOR 200.0;
LEVEL 1 FOR 200.0;
LEVEL 0 FOR 200.0;
LEVEL 1 FOR 200.0;
LEVEL 0 FOR 100.0;
LEVEL 1 FOR 100.0;
LEVEL 0 FOR 200.0;
LEVEL 1 FOR 200.0;
LEVEL 0 FOR 200.0;
LEVEL 1 FOR 200.0;
LEVEL 0 FOR 100.0;
LEVEL 1 FOR 100.0;
LEVEL 0 FOR 200.0;
LEVEL 1 FOR 200.0;
LEVEL 0 FOR 200.0;
LEVEL 1 FOR 200.0;
LEVEL 0 FOR 100.0;

I have set out to try and get this information into a list structure that looks like:

[[0, 100], [1, 100], [0, 100], [1, 100], [0, 200], [1, 200], [0, 200], [1, 200], [0, 100], [1, 100], [0, 200], [1, 200], [0, 200], [1, 200], [0, 100], [1, 100], [0, 200], [1, 200], [0, 200], [1, 200], [0, 100]]

However, I am struggling to work out how to do this. I attempted something that I thought worked, but on revisiting it I spotted my mistakes.

import pyparsing as pp


def get_data(LINES):
    node_inst = []
    total_inst = []
    r = []
    c = 0

    rep_search = pp.Literal('REPEAT = ') + pp.Word(pp.nums)
    log_search = pp.Literal('LEVEL') + pp.Word('01') + pp.Literal('FOR') + pp.Word(pp.nums + '.')
    bra_search = pp.Literal('}')

    for line in LINES:
        print(line)
        rep = rep_search.searchString(line)
        log = log_search.searchString(line)
        bra = bra_search.searchString(line)

        if rep:
            #print(line)
            c += 1
            if c > 1: # no logic values have been found when c == 1
                for R in range(r[-1]):
                    for n in node_inst:
                        total_inst.append(n)
                node_inst = []
            r.append(int(rep[0][-1]))

        elif log:
            #print(line)
            node_inst.append([int(log[0][1]),
                              int(round(1000 * float(log[0][-1])))])

        elif bra:
            #print(line)
            if node_inst:
                for R in range(r[-1]):
                    for n in node_inst:
                        total_inst.append(n)
                node_inst = []
            if r:
                del r[-1]

    return total_inst

Here, r is a list that keeps track of the active repeat values; its last value is deleted whenever a '}' is encountered. This produces something close, but the values inside the loop that repeats 2 times are only repeated 2 times, instead of also being repeated as part of the enclosing loop that repeats 3 times.
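The behaviour described above follows from flushing node_inst straight into total_inst, so an inner block never gets multiplied by the repeat count of the block around it. One possible fix, shown here as a minimal sketch rather than a tested parser, keeps one buffer per open NODE block and, on each '}', repeats that buffer into its parent's buffer. It assumes one statement per line as in the sample, uses plain `re` instead of pyparsing, and the names `expand_lines`, `REPEAT_RE` and `LEVEL_RE` are hypothetical:

```python
import re

REPEAT_RE = re.compile(r'REPEAT\s*=\s*(\d+)')
LEVEL_RE = re.compile(r'LEVEL\s+([01])\s+FOR\s+([\d.]+)')


def expand_lines(lines):
    """Expand nested NODE/REPEAT blocks into a flat list of [level, duration]."""
    # Each stack entry is [repeat count, buffered items]; entry 0 is the top level.
    stack = [[1, []]]
    pending_node = False  # True between a NODE line and its opening brace

    for raw in lines:
        line = raw.strip()
        if line == 'NODE':
            pending_node = True
        elif line == '{' and pending_node:
            stack.append([1, []])
            pending_node = False
        elif line == '}' and len(stack) > 1:
            # Close the innermost block: repeat its items into the parent buffer.
            repeat, items = stack.pop()
            stack[-1][1].extend(items * repeat)
        else:
            rep = REPEAT_RE.match(line)
            log = LEVEL_RE.match(line)
            if rep:
                stack[-1][0] = int(rep.group(1))
            elif log:
                # Durations are kept as floats here; scale or round as needed.
                stack[-1][1].append((int(log.group(1)), float(log.group(2))))

    # Items are stored as tuples internally and converted to lists on return.
    return [list(item) for item in stack[0][1]]
```

Because an inner block is expanded into its parent's buffer before the parent is closed, the parent's own repeat count then applies to the already expanded inner items. The braces belonging to TRANSITION_LIST are ignored, since they are not preceded by a NODE line.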

Any help or tips would be appreciated. I am just drawing a blank with what is some pretty rough scripting. Anything in my code can be changed, but to my knowledge the input file format cannot.


Solution

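  • Something like this should work. Note that it depends heavily on the formatting of the file: each NODE keyword, closing brace and statement must sit on its own line, as in the example.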

    import re
    
    class Node:
        def __init__(self, parent, items, repeat):
            self.parent = parent
            self.items = items
            self.repeat = repeat
    
    root = Node(None, [], 1)
    node = root
    
    with open('levels.txt') as f:
        for line in f:
            line = line.strip()
            if line == 'NODE':
                new_node = Node(node, [], 1)
                node.items.append(new_node)
                node = new_node
    
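            if line == '}' and node.parent is not None:
                # Step back out to the enclosing node. The guard stops the
                # closing brace of TRANSITION_LIST from moving above the root.
                node = node.parent
    
            res = re.match(r'REPEAT = (\d+)', line)
            if res:
                node.repeat = int(res.group(1))
    
            res = re.match(r'LEVEL (\d+) FOR ([\d.]+)', line)
            if res:
                node.items.append((int(res.group(1)), float(res.group(2))))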
    
    
    def generate(node):
        # Flatten the tree: repeat each node's items, recursing into child nodes.
        res = []
        for i in range(node.repeat):
            for item in node.items:
                if isinstance(item, Node):
                    res.extend(generate(item))
                elif isinstance(item, tuple):
                    res.append(item)
        return res
    
    res = generate(root)
    
    for r in res:
        print(r)
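
Since the loop above compares whole stripped lines, it relies on the layout of the file. A less layout-sensitive variant could tokenize the whole text with one regular expression and expand the blocks recursively. The following is only a sketch under assumptions: it treats every `{` as the start of a block (so the NODE keyword itself is not needed), it assumes braces and the keywords REPEAT and LEVEL never appear inside a channel name, and `TOKEN_RE`, `tokenize` and `parse_block` are hypothetical names:

```python
import re

# One alternative per token kind; everything else in the text is skipped.
TOKEN_RE = re.compile(r'''
    (?P<open>\{)
  | (?P<close>\})
  | REPEAT\s*=\s*(?P<repeat>\d+)
  | LEVEL\s+(?P<level>[01])\s+FOR\s+(?P<time>\d+(?:\.\d+)?)
''', re.VERBOSE)


def tokenize(text):
    """Yield (kind, value) pairs for braces, REPEAT and LEVEL statements."""
    for m in TOKEN_RE.finditer(text):
        if m.lastgroup == 'time':
            yield 'level', (int(m.group('level')), float(m.group('time')))
        elif m.lastgroup == 'repeat':
            yield 'repeat', int(m.group('repeat'))
        else:
            yield m.lastgroup, None  # 'open' or 'close'


def parse_block(tokens):
    """Consume tokens up to the matching '}' and return the expanded items."""
    repeat, items = 1, []
    for kind, value in tokens:
        if kind == 'close':
            break
        if kind == 'repeat':
            repeat = value
        elif kind == 'level':
            items.append(value)
        elif kind == 'open':
            # The nested call shares the same iterator, so it consumes the
            # inner block and this loop resumes right after its '}'.
            items.extend(parse_block(tokens))
    return items * repeat


# The statements no longer need to be on separate lines.
text = 'NODE { REPEAT = 2; LEVEL 0 FOR 5.0; LEVEL 1 FOR 2.5; }'
print(parse_block(tokenize(text)))
```

With more than one TRANSITION_LIST in the input, this sketch would concatenate all channels into a single list, so the text would need to be split per channel first.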