I hope you are doing well. I have the following output:
ClassName Bugs HighBugs LowBugs NormalBugs WMC LOC
Class1 4 0 1 3 34 77
Class2 0 0 0 0 9 45
Class3 3 0 1 2 10 18
Class4 0 0 0 0 44 46
Class5 6 2 2 2 78 94
The result I want is as follows:
ClassName Bugs HighBugs LowBugs NormalBugs WMC LOC
Class1 1 0 0 1 34 77
Class1 1 0 0 1 34 77
Class1 1 0 0 1 34 77
Class1 1 0 1 0 34 77
Class2 0 0 0 0 9 45
Class3 1 0 0 1 10 18
Class3 1 0 0 1 10 18
Class3 1 0 1 0 10 18
Class4 0 0 0 0 44 46
Class5 1 0 0 1 78 94
Class5 1 0 0 1 78 94
Class5 1 0 1 0 78 94
Class5 1 0 1 0 78 94
Class5 1 1 0 0 78 94
Class5 1 1 0 0 78 94
A little explanation: I want to duplicate each class according to the Bugs column, where Bugs = HighBugs + LowBugs + NormalBugs. As you can see in the desired result, the duplicated rows contain only ones and zeros: there is one row per bug, with a 1 in Bugs and in the column for that bug's type. Classes with no bugs stay as a single all-zero row.
Thank you in advance, and have a good day, all.
Try:
import pandas as pd

# `df` is the DataFrame shown in the question.
dfs, col_names, other_cols = (
    [],
    ["NormalBugs", "LowBugs", "HighBugs"],
    ["ClassName", "WMC", "LOC"],
)
for _, row in df.iterrows():
    if row["Bugs"] == 0:
        # No bugs: keep the class as a single all-zero row.
        dfs.append(
            pd.DataFrame(
                [[0, 0, 0, *[row[c] for c in other_cols]]],
                columns=col_names + other_cols,
            )
        )
    else:
        # One row per bug: a 1 in that bug's column, NaN elsewhere for now.
        for c in col_names:
            dfs.append(pd.DataFrame([1] * row[c], columns=[c]))
            for oc in other_cols:
                dfs[-1][oc] = row[oc]

# NaN -> 0, then restore the integer dtype of the bug columns.
df_out = pd.concat(dfs).fillna(0)
df_out[col_names] = df_out[col_names].astype(int)
df_out["Bugs"] = df_out[col_names].any(axis=1).astype(int)
print(
    df_out[
        ["ClassName", "Bugs", "HighBugs", "LowBugs", "NormalBugs", "WMC", "LOC"]
    ]
)
Prints:
  ClassName  Bugs  HighBugs  LowBugs  NormalBugs  WMC  LOC
0    Class1     1         0        0           1   34   77
1    Class1     1         0        0           1   34   77
2    Class1     1         0        0           1   34   77
0    Class1     1         0        1           0   34   77
0    Class2     0         0        0           0    9   45
0    Class3     1         0        0           1   10   18
1    Class3     1         0        0           1   10   18
0    Class3     1         0        1           0   10   18
0    Class4     0         0        0           0   44   46
0    Class5     1         0        0           1   78   94
1    Class5     1         0        0           1   78   94
0    Class5     1         0        1           0   78   94
1    Class5     1         0        1           0   78   94
0    Class5     1         1        0           0   78   94
1    Class5     1         1        0           0   78   94
EDIT: Added more columns.
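If the frame is large, the row-by-row loop may get slow. Below is a rough sketch of a loop-free variant of the same idea, using melt plus Index.repeat. It is only one possible approach, not a drop-in replacement for the code above: the helper name explode_bugs and the temporary columns _pos, kind and n are made up for illustration, and the sample frame is typed in from the question. It also resets the index to 0..n-1 instead of keeping the repeated index shown in the printout.

```python
import pandas as pd

# Sample data typed in from the question.
df = pd.DataFrame(
    {
        "ClassName": ["Class1", "Class2", "Class3", "Class4", "Class5"],
        "Bugs": [4, 0, 3, 0, 6],
        "HighBugs": [0, 0, 0, 0, 2],
        "LowBugs": [1, 0, 1, 0, 2],
        "NormalBugs": [3, 0, 2, 0, 2],
        "WMC": [34, 9, 10, 44, 78],
        "LOC": [77, 45, 18, 46, 94],
    }
)

bug_cols = ["NormalBugs", "LowBugs", "HighBugs"]
id_cols = ["ClassName", "WMC", "LOC"]


def explode_bugs(frame):
    """Hypothetical helper: one output row per bug, one all-zero row per bug-free class."""
    # Remember the original row order so it can be restored after reshaping.
    frame = frame.assign(_pos=range(len(frame)))

    # Long format: one row per (class, bug type) holding that type's count.
    long = frame.melt(
        id_vars=id_cols + ["_pos"],
        value_vars=bug_cols,
        var_name="kind",
        value_name="n",
    )

    # Repeat each long row n times; rows with n == 0 disappear.
    rep = long.loc[long.index.repeat(long["n"])].copy()

    # Turn the bug type back into 0/1 indicator columns.
    for c in bug_cols:
        rep[c] = (rep["kind"] == c).astype(int)

    # Classes without any bugs are kept once, unchanged (all zeros).
    no_bugs = frame[bug_cols].sum(axis=1) == 0
    zero = frame.loc[no_bugs, id_cols + ["_pos"] + bug_cols]

    # A stable sort keeps the Normal/Low/High order within each class.
    out = pd.concat([rep, zero]).sort_values("_pos", kind="mergesort")
    out["Bugs"] = out[bug_cols].sum(axis=1)
    cols = ["ClassName", "Bugs", "HighBugs", "LowBugs", "NormalBugs", "WMC", "LOC"]
    return out[cols].reset_index(drop=True)


result = explode_bugs(df)
print(result)
```

Note that this sketch decides "no bugs" from the sum of the three bug columns rather than from the Bugs column; the two agree as long as Bugs = HighBugs + LowBugs + NormalBugs holds, as stated in the question.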