Parse k6 output to get specific information

I am trying to extract data from a k6 output (https://docs.k6.io/docs/results-output):

data_received.........: 246 kB 21 kB/s
data_sent.............: 174 kB 15 kB/s
http_req_blocked......: avg=26.24ms  min=0s      med=13.5ms  max=145.27ms p(90)=61.04ms p(95)=70.04ms 
http_req_connecting...: avg=23.96ms  min=0s      med=12ms    max=145.27ms p(90)=57.03ms p(95)=66.04ms 
http_req_duration.....: avg=197.41ms min=70.32ms med=91.56ms max=619.44ms p(90)=288.2ms p(95)=326.23ms
http_req_receiving....: avg=141.82µs min=0s      med=0s      max=1ms      p(90)=1ms     p(95)=1ms     
http_req_sending......: avg=8.15ms   min=0s      med=0s      max=334.23ms p(90)=1ms     p(95)=1ms     
http_req_waiting......: avg=189.12ms min=70.04ms med=91.06ms max=343.42ms p(90)=282.2ms p(95)=309.22ms
http_reqs.............: 190    16.054553/s
iterations............: 5      0.422488/s
vus...................: 200    min=200 max=200
vus_max...............: 200    min=200 max=200

The data comes in the above format, and I am trying to find a way to get each line's metric name followed by its values only. As an example:

http_req_duration: 197.41ms, 70.32ms, 91.56ms, 619.44ms, 288.2ms, 326.23ms

I have to do this for ~50-100 files, so I am looking for a regex or a similarly quick way to do it without writing too much code. Is that possible?

Solution

Here's a simple Python solution:

import re

FIELD = re.compile(r"(\w+)\.*:(.*)", re.DOTALL)  # split the line to name:value
VALUES = re.compile(r"(?<==).*?(?=\s|$)")  # match individual values from http_req_* fields

# open the input file `k6_input.log` for reading, and `k6_parsed.log` for writing
with open("k6_input.log", "r") as f_in, open("k6_parsed.log", "w") as f_out:
    for line in f_in:  # read the input file line by line
        field = FIELD.match(line)  # first match all <field_name>...:<values> fields
        if field:
            name = field.group(1)  # get the field name from the first capture group
            f_out.write(name + ": ")  # write the field name to the output file
            value = field.group(2)  # get the field value from the second capture group
            if name.startswith("http_req_"):  # parse out only http_req_* fields
                f_out.write(", ".join(VALUES.findall(value)) + "\n")  # extract the values
            else:  # verbatim copy of other fields
                f_out.write(value)
        else:  # encountered unrecognizable field, just copy the line
            f_out.write(line)

For a file with the contents shown above, you'll get the following result:

data_received:  246 kB 21 kB/s
data_sent:  174 kB 15 kB/s
http_req_blocked: 26.24ms, 0s, 13.5ms, 145.27ms, 61.04ms, 70.04ms
http_req_connecting: 23.96ms, 0s, 12ms, 145.27ms, 57.03ms, 66.04ms
http_req_duration: 197.41ms, 70.32ms, 91.56ms, 619.44ms, 288.2ms, 326.23ms
http_req_receiving: 141.82µs, 0s, 0s, 1ms, 1ms, 1ms
http_req_sending: 8.15ms, 0s, 0s, 334.23ms, 1ms, 1ms
http_req_waiting: 189.12ms, 70.04ms, 91.06ms, 343.42ms, 282.2ms, 309.22ms
http_reqs:  190    16.054553/s
iterations:  5      0.422488/s
vus:  200    min=200 max=200
vus_max:  200    min=200 max=200
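
To see how the two patterns cooperate, here is a minimal check on a single line taken from the sample. It assumes, as in the sample, that each line starts at column 0 with the metric name; if your files indent the lines, FIELD.match() will not match them as written.

```python
import re

# Same two patterns as in the script above
FIELD = re.compile(r"(\w+)\.*:(.*)", re.DOTALL)
VALUES = re.compile(r"(?<==).*?(?=\s|$)")

line = (
    "http_req_duration.....: avg=197.41ms min=70.32ms med=91.56ms "
    "max=619.44ms p(90)=288.2ms p(95)=326.23ms\n"
)

field = FIELD.match(line)
name = field.group(1)   # metric name without the dot padding
value = field.group(2)  # everything after the colon, including the newline

# The lookbehind (?<==) starts each match right after an "=",
# and the lazy .*? stops at the next whitespace or at end of string.
values = VALUES.findall(value)

print(name)
print(values)
```

FIELD only splits the line into name and value; VALUES is what drops the avg=, min=, p(90)= labels.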

If you have to run it over many files, I'd suggest looking into glob.glob(), os.walk() or os.listdir() to list all the files you need, then looping over them and running the above on each one to automate the process further.
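
A rough sketch of such a loop using glob.glob() follows. The directory name "results", the "*.log" pattern, the ".parsed" suffix, the UTF-8 encoding, and the helper names parse_k6_line, parse_k6_file and parse_all are all assumptions made for illustration; adjust them to your setup.

```python
import glob
import os
import re

# Same two patterns as in the script above
FIELD = re.compile(r"(\w+)\.*:(.*)", re.DOTALL)
VALUES = re.compile(r"(?<==).*?(?=\s|$)")


def parse_k6_line(line):
    """Return one output line, using the same rules as the script above."""
    field = FIELD.match(line)
    if not field:
        return line  # unrecognized line, copy it unchanged
    name, value = field.group(1), field.group(2)
    if name.startswith("http_req_"):
        return name + ": " + ", ".join(VALUES.findall(value)) + "\n"
    return name + ": " + value  # other fields are copied verbatim


def parse_k6_file(src, dst):
    """Parse the file `src` and write the result to `dst`."""
    # UTF-8 is an assumption; the sample contains a non-ASCII "µ"
    with open(src, "r", encoding="utf-8") as f_in, \
            open(dst, "w", encoding="utf-8") as f_out:
        for line in f_in:
            f_out.write(parse_k6_line(line))


def parse_all(directory, pattern="*.log", suffix=".parsed"):
    """Parse every file in `directory` matching `pattern`; return output paths."""
    outputs = []
    for src in sorted(glob.glob(os.path.join(directory, pattern))):
        dst = src + suffix  # e.g. run1.log -> run1.log.parsed
        parse_k6_file(src, dst)
        outputs.append(dst)
    return outputs


if __name__ == "__main__":
    # "results" is a hypothetical directory holding the k6 output files
    for path in parse_all("results"):
        print("wrote", path)
```

Because the outputs end in ".parsed", they do not match "*.log", so running the script again will not re-parse its own results.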