I have generated benchmarks comparing two approaches to scaling video files with the ffmpeg tool.
The benchmarks are logged in this format:
x.mp4 Output_Resolution : 10 p
Parameter1 : a
Method : A
real 0m5.788s
user 0m16.112s
sys 0m0.313s
Method : B, ParameterB1 : b11
ParameterB2 : b21
real 0m6.637s
user 0m16.618s
sys 0m0.720s
ParameterB2 : b22
real 0m5.486s
user 0m17.570s
sys 0m0.568s
ParameterB2 : b23
real 0m5.232s
user 0m18.212s
sys 0m0.718s
Method : B, ParameterB1 : b12
ParameterB2 : b21
real 0m6.398s
user 0m16.790s
sys 0m0.528s
ParameterB2 : b22
real 0m5.449s
user 0m17.229s
sys 0m0.533s
ParameterB2 : b23
real 0m5.275s
user 0m18.411s
sys 0m0.522s
##################################################################################################################
Parameter1 : b
Method : A
real 0m5.927s
user 0m16.451s
sys 0m0.308s
Method : B, ParameterB1 : b11
ParameterB2 : b21
real 0m6.685s
user 0m17.044s
sys 0m0.597s
ParameterB2 : b22
real 0m5.942s
user 0m18.971s
sys 0m0.804s
ParameterB2 : b23
real 0m6.119s
user 0m20.869s
sys 0m0.792s
.
.
.
There are two methods, A and B. Both share Parameter1, which takes the values a, b, c, ...
Method B has two additional parameters, ParameterB1 and ParameterB2, which take the values b11, b12, b13, ... and b21, b22, b23, ... respectively. A separator line consisting of multiple # characters separates the measurements for different values of Parameter1.
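The log layout described above can also be read with a small state machine that remembers the most recent Parameter1, Method, ParameterB1 and ParameterB2 lines and records a value whenever a "real" line appears. This is only a minimal sketch: the names `parse_log` and `REAL_RE` are hypothetical, and it assumes every line looks like the excerpt shown (blank lines and tab or space separators are tolerated).

```python
import re

# Matches the "real" line of the shell's time output, e.g. "real 0m5.788s".
REAL_RE = re.compile(r"^real\s+(\d+)m([\d.]+)s$")


def parse_log(text):
    """Return {(method, b1, b2): {parameter1: seconds}} for a benchmark log."""
    results = {}
    p1 = method = b1 = b2 = None
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("Parameter1"):
            p1 = line.split(":")[-1].strip()
        elif line.startswith("Method"):
            # Either "Method : A" or "Method : B, ParameterB1 : b11".
            pairs = (part.split(":", 1) for part in line.split(","))
            fields = {k.strip(): v.strip() for k, v in pairs}
            method = fields["Method"]
            b1 = fields.get("ParameterB1")  # None for method A
            b2 = None
        elif line.startswith("ParameterB2"):
            b2 = line.split(":")[-1].strip()
        else:
            m = REAL_RE.match(line)
            if m and p1 is not None and method is not None:
                seconds = 60 * int(m.group(1)) + float(m.group(2))
                results.setdefault((method, b1, b2), {})[p1] = seconds
    return results


# Small excerpt in the same shape as the log above.
sample = """Parameter1 : a
Method : A
real 0m5.788s
user 0m16.112s
Method : B, ParameterB1 : b11
ParameterB2 : b21
real 0m6.637s
####
Parameter1 : b
Method : A
real 0m5.927s
"""
print(parse_log(sample))
```

Keying the result by (method, ParameterB1, ParameterB2) means each dictionary entry already corresponds to one table row, with one value per Parameter1 column.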
I would like to view the benchmarks in tabular format.
+--------+---------------------------------------+----------------+----------------+----------------+
| Method |                                       | Parameter1 (a) | Parameter1 (b) | Parameter1 (c) |
+--------+---------------------------------------+----------------+----------------+----------------+
| A      | NA                                    | 4.03s          | 3.23s          | 1.4s           |
+--------+-------------------+-------------------+----------------+----------------+----------------+
| B      | ParameterB1 (b11) | ParameterB2 (b21) | .              |                |                |
|        |                   +-------------------+----------------+----------------+----------------+
|        |                   | ParameterB2 (b22) | .              |                |                |
|        |                   +-------------------+----------------+----------------+----------------+
|        |                   | ParameterB2 (b23) | .              |                |                |
|        +-------------------+-------------------+----------------+----------------+----------------+
|        | ParameterB1 (b12) | ParameterB2 (b21) | .              |                |                |
|        |                   +-------------------+----------------+----------------+----------------+
|        |                   | ParameterB2 (b22) | .              |                |                |
|        |                   +-------------------+----------------+----------------+----------------+
|        |                   | ParameterB2 (b23) | .              |                |                |
|        +-------------------+-------------------+----------------+----------------+----------------+
|        | ParameterB1 (b13) | ParameterB2 (b21) | .              |                |                |
|        |                   +-------------------+----------------+----------------+----------------+
|        |                   | ParameterB2 (b22) | .              |                |                |
|        |                   +-------------------+----------------+----------------+----------------+
|        |                   | ParameterB2 (b23) | .              |                |                |
+--------+-------------------+-------------------+----------------+----------------+----------------+
The cell values are the real times in seconds (taken from lines such as real 0m6.119s).
How can I generate such a table using Python?
I have written a "not-so-efficient" Python script with help from an answer to a similar question I asked a few months ago.
import pprint


def gettime(x):
    # Convert a time value such as "0m5.788s" to seconds.
    m, s = map(float, x[:-1].split('m'))
    return 60 * m + s


with open("log") as fp:
    lines = fp.read().splitlines()

idx = 0
A = {}  # A[Parameter1] -> real time in seconds
B = {}  # B[Parameter1][ParameterB1][ParameterB2] -> real time in seconds
while idx < len(lines):
    if "Parameter1" in lines[idx]:
        Parameter1 = lines[idx].split(' ')[-1]
        temp1 = {}
        idx += 2
        if "A" in lines[idx]:
            idx += 2
            A[Parameter1] = gettime(lines[idx].split('\t')[-1])
        while idx < len(lines):
            if "B" in lines[idx]:
                # "Method : B, ParameterB1 : b11"
                ParameterB1 = lines[idx].split(' ')[-1]
                temp2 = {}
                idx += 1
                while idx < len(lines):
                    if "ParameterB2" in lines[idx]:
                        ParameterB2 = lines[idx].split(' ')[-1]
                        idx += 2
                        temp2[ParameterB2] = gettime(lines[idx].split('\t')[-1])
                    elif "#" in lines[idx] or "B" in lines[idx]:
                        break
                    idx += 1
                temp1[ParameterB1] = temp2
            elif "#" in lines[idx]:
                break
            else:
                idx += 1
        # Store the block even if the log ends without a separator line.
        B[Parameter1] = temp1
    else:
        idx += 1

print("A")
print(A)
pp = pprint.PrettyPrinter(sort_dicts=False, depth=4)
print("B")
pp.pprint(B)
This script parses the log and stores the measurements for each method and parameter combination in dictionaries.
Example output from the script:
A
{'a': 4.03, 'b': 3.23, 'c': 1.4}
B
{'a': {'b21': {'b11': 0.0, 'b12': 0.0, 'b13': 0.0},
       'b22': {'b11': 0.0, 'b12': 0.0, 'b13': 0.0},
       'b23': {'b11': 0.0, 'b12': 0.0, 'b13': 0.0}},
 'b': {'b21': {'b11': 0.0, 'b12': 0.0, 'b13': 0.0},
       'b22': {'b11': 0.0, 'b12': 0.0, 'b13': 0.0},
       'b23': {'b11': 0.0, 'b12': 0.0, 'b13': 0.0}},
 'c': {'b21': {'b11': 0.0, 'b12': 0.0, 'b13': 0.0},
       'b22': {'b11': 0.0, 'b12': 0.0, 'b13': 0.0},
       'b23': {'b11': 0.0, 'b12': 0.0, 'b13': 0.0}}}
How can I print this in tabular format as described above?
I extended the Python script from the question to render the data stored in the dictionaries as a table using PrettyTable.
import pprint
import io
import sys
from prettytable import PrettyTable
# install PTable package


def gettime(x):
    # Convert a time value such as "0m5.788s" to seconds.
    m, s = map(float, x[:-1].split('m'))
    return 60 * m + s


with open("log") as fp:
    lines = fp.read().splitlines()

idx = 0
A = {}
B = {}
title = ""
Parameter1_list = []
ParameterB1_list = []
ParameterB2_list = []
while idx < len(lines):
    if "Parameter1" in lines[idx]:
        Parameter1 = lines[idx].split(' ')[-1]
        Parameter1_list.append(Parameter1)
        temp1 = {}
        idx += 2
        if "A" in lines[idx]:
            idx += 2
            A[Parameter1] = gettime(lines[idx].split('\t')[-1])
        while idx < len(lines):
            if "B" in lines[idx]:
                ParameterB1 = lines[idx].split(' ')[-1]
                ParameterB1_list.append(ParameterB1)
                temp2 = {}
                idx += 1
                while idx < len(lines):
                    if "ParameterB2" in lines[idx]:
                        ParameterB2 = lines[idx].split(' ')[-1]
                        ParameterB2_list.append(ParameterB2)
                        idx += 2
                        temp2[ParameterB2] = gettime(lines[idx].split('\t')[-1])
                    elif "#" in lines[idx] or "B" in lines[idx]:
                        break
                    idx += 1
                temp1[ParameterB1] = temp2
            elif "#" in lines[idx]:
                break
            else:
                idx += 1
        # Store the block even if the log ends without a separator line.
        B[Parameter1] = temp1
    elif ".mp4" in lines[idx]:
        # The first line of the log, e.g. "x.mp4 Output_Resolution : 10 p".
        title = lines[idx]
        idx += 1
    else:
        idx += 1

#print("A")
#print(A)
#pp = pprint.PrettyPrinter(sort_dicts=False,depth=4)
#print("B")
#pp.pprint(B)

# Remove duplicates while keeping the order of first appearance.
Parameter1 = list(dict.fromkeys(Parameter1_list))
ParameterB1 = list(dict.fromkeys(ParameterB1_list))
ParameterB2 = list(dict.fromkeys(ParameterB2_list))

# t1 holds the row labels, t2 the measurements; rows are added in lockstep.
t1 = PrettyTable(['Method', 'ParameterB1', 'ParameterB2'])
t2 = PrettyTable(Parameter1)
t1.title = title
t2.title = "Parameter1"
t1.add_row(['A', 'NA', 'NA'])
t2.add_row(list(A.values()))
for d in ParameterB1:
    for c in ParameterB2:
        values = []
        for e in Parameter1:
            values.append(B[e][d][c])
        t1.add_row(['B', d, c])
        t2.add_row(values)

o1 = io.StringIO(t1.get_string())
o2 = io.StringIO(t2.get_string())
with open('result.txt', "w") as f2:
    for x, y in zip(o1, o2):
        # Drop t1's right border so t2's left border joins the two tables.
        line = x.strip()[:-1] + y.strip() + "\n"
        sys.stdout.write(line)
        f2.write(line)
This writes the table to a file (result.txt) and to stdout.
Output:
+------------------------------------+-------------------+
|   x.mp4 Output Resolution : 10p    |     Parameter1    |
+--------+-------------+-------------+------+------+-----+
| Method | ParameterB1 | ParameterB2 |  a   |  b   |  c  |
+--------+-------------+-------------+------+------+-----+
|   A    |      NA     |      NA     | 4.03 | 3.23 | 1.4 |
|   B    |     b11     |     b21     | 0.0  | 0.0  | 0.0 |
|   B    |     b11     |     b22     | 0.0  | 0.0  | 0.0 |
|   B    |     b11     |     b23     | 0.0  | 0.0  | 0.0 |
|   B    |     b12     |     b21     | 0.0  | 0.0  | 0.0 |
|   B    |     b12     |     b22     | 0.0  | 0.0  | 0.0 |
|   B    |     b12     |     b23     | 0.0  | 0.0  | 0.0 |
|   B    |     b13     |     b21     | 0.0  | 0.0  | 0.0 |
|   B    |     b13     |     b22     | 0.0  | 0.0  | 0.0 |
|   B    |     b13     |     b23     | 0.0  | 0.0  | 0.0 |
+--------+-------------+-------------+------+------+-----+
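The two tables are joined line by line: the last character of each line of the label table is dropped, and the matching line of the value table is appended so that its left border becomes the shared separator. A minimal standalone sketch of that idea, using a hypothetical helper name `join_side_by_side` and assuming both rendered tables have the same number of lines (`zip` would silently drop any extra lines):

```python
def join_side_by_side(left, right):
    """Join two rendered ASCII tables that have the same number of lines."""
    joined = []
    for l_line, r_line in zip(left.splitlines(), right.splitlines()):
        # Drop the left table's final border character so the right table's
        # leading "+" or "|" serves as the shared column separator.
        joined.append(l_line.rstrip()[:-1] + r_line.strip())
    return "\n".join(joined)


left = "+---+\n| x |\n+---+"
right = "+---+\n| y |\n+---+"
print(join_side_by_side(left, right))
# +---+---+
# | x | y |
# +---+---+
```

This is why both tables need a title and the same number of rows: every line of one table must have a counterpart in the other.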
This is the closest I could get to the tabular format described in the question.
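One way to get a little closer to the grouped layout, without relying on any table library, is to blank out the Method and ParameterB1 cells when they repeat the row above. The following is only a sketch under that assumption; `render_grouped` is a hypothetical helper, and the sample values are the Parameter1 = a timings from the log excerpt in the question.

```python
def render_grouped(header, rows, group_cols=2):
    """Render rows as an ASCII table, blanking repeated leading group cells."""
    rows = [[str(c) for c in row] for row in rows]
    display = []
    prev = None
    for row in rows:
        shown = list(row)
        if prev is not None:
            for i in range(group_cols):
                if row[i] != prev[i]:
                    break  # group changed: keep this cell and all later ones
                shown[i] = ""
        display.append(shown)
        prev = row
    header = [str(h) for h in header]
    widths = [max(len(r[i]) for r in [header] + display)
              for i in range(len(header))]
    rule = "+" + "+".join("-" * (w + 2) for w in widths) + "+"

    def fmt(cells):
        return "| " + " | ".join(c.ljust(w) for c, w in zip(cells, widths)) + " |"

    lines = [rule, fmt(header), rule]
    lines += [fmt(r) for r in display]
    lines.append(rule)
    return "\n".join(lines)


header = ["Method", "ParameterB1", "ParameterB2", "a"]
rows = [
    ["A", "NA", "NA", 5.788],
    ["B", "b11", "b21", 6.637],
    ["B", "b11", "b22", 5.486],
    ["B", "b11", "b23", 5.232],
    ["B", "b12", "b21", 6.398],
    ["B", "b12", "b22", 5.449],
    ["B", "b12", "b23", 5.275],
]
print(render_grouped(header, rows))
```

The inner horizontal rules of the desired layout are not drawn here; only the repeated labels are suppressed, which keeps the rendering code short.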