I have a df that looks like this:
id.1.value.1 id.2.value.2 id.1.question id.2.value.2
TRUE FALSE TRUE TRUE
I want to create logic that scans the column names of the df and extracts the last number only from column names that have value
in column name and compare the value in the cell of the column that contains value
with following logic:
If value in column value
equal TRUE
then compare the last number in multi value dictionary
use second value in multi key dictionary to create dataframe column names
Example:
my_dict = {1: ('a', 'category'),2: ('b', 'category'),\
3: ('c', 'category'),4:('d','category'),\
5:('e','subcategory'),6:('f','subcategory'),\
7:('g','subcategory'),8:('h','subcategory'),\
9:('i','subcategory'),10:('j','subcategory'),\
11:('k','subcategor'),12:('l','subcategory'),\
13:('m','subcategory'),14:('n','subcategory'),\
15:('o','subcategory'),16:('p','subcategory'),\
17:('q','subcategory'),18:('r','subcategory'),\
19:('s','subcategory'),20:('t','subcategory'),\
21:('u','subcategory'),22:('v','subcategory'),\
23:('w','subcategory'),24:('x','subcategory')
}
If my current df looks likes this:
id.1.value.1 id.2.value.2 id.1.question id.6.value.6
TRUE FALSE TRUE TRUE
New df should look like this:
category subcategory
a f
names = df.columns
new_df = pd.DataFrame()
for name in names:
if ('value' in name) & df[name][0]:
last_number = int(name[-1])
key, value = my_dict[last_number]
try:
new_df[value][0] = list(new_df[value][0]) + [key]
except:
new_df[value] = [key]