regex, string-matching, k6

Parse output from k6 data to get specific information


I am trying to extract data from a k6 output (https://docs.k6.io/docs/results-output):

data_received.........: 246 kB 21 kB/s
data_sent.............: 174 kB 15 kB/s
http_req_blocked......: avg=26.24ms  min=0s      med=13.5ms  max=145.27ms p(90)=61.04ms p(95)=70.04ms 
http_req_connecting...: avg=23.96ms  min=0s      med=12ms    max=145.27ms p(90)=57.03ms p(95)=66.04ms 
http_req_duration.....: avg=197.41ms min=70.32ms med=91.56ms max=619.44ms p(90)=288.2ms p(95)=326.23ms
http_req_receiving....: avg=141.82µs min=0s      med=0s      max=1ms      p(90)=1ms     p(95)=1ms     
http_req_sending......: avg=8.15ms   min=0s      med=0s      max=334.23ms p(90)=1ms     p(95)=1ms     
http_req_waiting......: avg=189.12ms min=70.04ms med=91.06ms max=343.42ms p(90)=282.2ms p(95)=309.22ms
http_reqs.............: 190    16.054553/s
iterations............: 5      0.422488/s
vus...................: 200    min=200 max=200
vus_max...............: 200    min=200 max=200

The data comes in the format above, and I am trying to find a way to reduce each line to its metric name followed by the values only. As an example:

http_req_duration: 197.41ms, 70.32ms, 91.56ms, 619.44ms, 288.2ms, 326.23ms

I have to do this for ~50-100 files and would like a RegEx or a similarly quick way to do it, without writing too much code. Is it possible?


Solution

  • Here's a simple Python solution:

    import re
    
    FIELD = re.compile(r"(\w+)\.*:(.*)", re.DOTALL)  # split the line to name:value
    VALUES = re.compile(r"(?<==).*?(?=\s|$)")  # match individual values from http_req_* fields
    
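    # open the input file `k6_input.log` for reading and `k6_parsed.log` for writing;
    # UTF-8 is assumed here because the values can contain non-ASCII units such as µs
    with open("k6_input.log", "r", encoding="utf-8") as f_in, \
            open("k6_parsed.log", "w", encoding="utf-8") as f_out: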
        for line in f_in:  # read the input file line by line
            field = FIELD.match(line)  # first match all <field_name>...:<values> fields
            if field:
                name = field.group(1)  # get the field name from the first capture group
                f_out.write(name + ": ")  # write the field name to the output file
                value = field.group(2)  # get the field value from the second capture group
                if name.startswith("http_req_"):  # parse out only http_req_* fields
                    f_out.write(", ".join(VALUES.findall(value)) + "\n")  # extract the values
                else:  # verbatim copy of other fields
                    f_out.write(value)
            else:  # encountered unrecognizable field, just copy the line
                f_out.write(line)
    

    For a file with the contents above, the result is:

    data_received:  246 kB 21 kB/s
    data_sent:  174 kB 15 kB/s
    http_req_blocked: 26.24ms, 0s, 13.5ms, 145.27ms, 61.04ms, 70.04ms
    http_req_connecting: 23.96ms, 0s, 12ms, 145.27ms, 57.03ms, 66.04ms
    http_req_duration: 197.41ms, 70.32ms, 91.56ms, 619.44ms, 288.2ms, 326.23ms
    http_req_receiving: 141.82µs, 0s, 0s, 1ms, 1ms, 1ms
    http_req_sending: 8.15ms, 0s, 0s, 334.23ms, 1ms, 1ms
    http_req_waiting: 189.12ms, 70.04ms, 91.06ms, 343.42ms, 282.2ms, 309.22ms
    http_reqs:  190    16.054553/s
    iterations:  5      0.422488/s
    vus:  200    min=200 max=200
    vus_max:  200    min=200 max=200
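
To see what each of the two patterns contributes, here is a minimal sketch that applies them to a single line taken from the question. The variable names `line`, `raw` and `values` are only illustrative.

```python
import re

FIELD = re.compile(r"(\w+)\.*:(.*)", re.DOTALL)  # name, dot padding, colon, rest
VALUES = re.compile(r"(?<==).*?(?=\s|$)")  # text after each "=" up to whitespace

line = ("http_req_waiting......: avg=189.12ms min=70.04ms med=91.06ms "
        "max=343.42ms p(90)=282.2ms p(95)=309.22ms\n")

field = FIELD.match(line)
name = field.group(1)  # the metric name; the dot padding is not captured
raw = field.group(2)  # everything after the colon, trailing newline included
values = VALUES.findall(raw)  # one entry per key=value pair, keys dropped

print(name)
print(", ".join(values))
```

Because `FIELD` is compiled with `re.DOTALL`, the second group keeps the trailing newline, which is why the non-`http_req_*` branch can write the value back without adding one.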

    If you have to run it over many files, look into glob.glob(), os.walk() or os.listdir() to list the files you need, then loop over them and run the above on each one to automate the whole process.
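
One way to sketch that loop is shown below. It is an illustration under assumptions, not a tested tool: the function names `parse_k6_summary` and `parse_directory`, the `*.log` pattern and the `_parsed` file name suffix are all made up for this example, and it uses `pathlib.Path.glob()` rather than `glob.glob()` purely for convenience. The files are assumed to be UTF-8.

```python
import re
from pathlib import Path

FIELD = re.compile(r"(\w+)\.*:(.*)", re.DOTALL)  # split a line into name:value
VALUES = re.compile(r"(?<==).*?(?=\s|$)")  # individual values of http_req_* fields


def parse_k6_summary(text):
    """Return the reformatted k6 summary for one file's contents (hypothetical helper)."""
    out = []
    for line in text.splitlines(keepends=True):
        field = FIELD.match(line)
        if not field:
            out.append(line)  # unrecognized line: copy it unchanged
            continue
        name, value = field.group(1), field.group(2)
        if name.startswith("http_req_"):
            # keep only the values, dropping the avg=/min=/... labels
            out.append(name + ": " + ", ".join(VALUES.findall(value)) + "\n")
        else:
            out.append(name + ": " + value)  # verbatim copy of other fields
    return "".join(out)


def parse_directory(in_dir, out_dir, pattern="*.log"):
    """Parse every file in in_dir that matches pattern and write it to out_dir."""
    out_path = Path(out_dir)
    out_path.mkdir(parents=True, exist_ok=True)
    written = []
    for src in sorted(Path(in_dir).glob(pattern)):
        dst = out_path / (src.stem + "_parsed" + src.suffix)
        parsed = parse_k6_summary(src.read_text(encoding="utf-8"))
        dst.write_text(parsed, encoding="utf-8")
        written.append(dst)
    return written
```

Calling, for example, `parse_directory("k6_logs", "k6_parsed")` would then turn `k6_logs/run1.log` into `k6_parsed/run1_parsed.log`; both directory names are placeholders.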