I loaded an excel file in a dataframe which contains questions, answers and some custom attributes in the first three columns. I would like to transpose all the answers to rows but keep the first columns with attributes by every answer.
Input:
ID | attribute_1 | attribute_2 | question_1 | question_2 | question_3 |
---|---|---|---|---|---|
1 | monday | mix1 | answer_1 | answer_2 | answer_3 |
2 | friday | mix2 | answer_1 | answer_2 | answer_3 |
Output:
ID | attribute_1 | attribute_2 | question | answer |
---|---|---|---|---|
1 | monday | mix1 | question_1 | answer_1 |
1 | monday | mix1 | question_2 | answer_2 |
1 | monday | mix1 | question_3 | answer_3 |
2 | friday | mix2 | question_1 | answer_1 |
2 | friday | mix2 | question_2 | answer_2 |
2 | friday | mix2 | question_3 | answer_3 |
I looked at df.transpose, but that won't exclude / multiply the attribute columns.
Anyone an idea? TIA
ABBOV
Try using melt
import pandas as pd
df = pd.DataFrame({'ID': [1, 2],
'attribute_1': ['monday', 'friday'],
'attribute_2': ['mix1', 'mix2'],
'question_1': ['answer_1', 'answer_1'],
'question_2': ['answer_2', 'answer_2'],
'question_3': ['answer_3', 'answer_3']})
df = df.melt(id_vars=['ID','attribute_1','attribute_2'],
var_name='question',
value_name='answer').sort_values(by='ID')
print(df)
Output
ID attribute_1 attribute_2 question answer
0 1 monday mix1 question_1 answer_1
2 1 monday mix1 question_2 answer_2
4 1 monday mix1 question_3 answer_3
1 2 friday mix2 question_1 answer_1
3 2 friday mix2 question_2 answer_2
5 2 friday mix2 question_3 answer_3