I have a csv file that contains data along the rows. I want to create a data frame by transporting all rows to columns.I have provided the following sample data frame that replicates the data in csv.
data = pd.DataFrame()
data ['Name'] = ['A','B','C']
data ['Home'] = ['Z','Y','X']
data ['1-0'] = [2.2,3.1,3]
data['2-0'] = [1.5,5,8]
data ['2-1'] = [5,2,3]
data ['HW'] = [1.2,3.2,4]
I want to convert the columns '1-0','2-0','2-1' to a column of a new data frame and its corresponding data to a new columns says Score. The first column ** Name** will be used as a new column but duplicates for all score.
The following is the expected output.
In the new data frame, the element from first column Name will be duplicated. The elements in the second column Score is some of the columns name from old data frame. The rest columns such as HW will be duplicated as the same as the name column.
May I know how to transform it? Your advice is much appreciated. Thanks. Zep.
Try melt
:
data.melt(['Name','Home','HW']).sort_values('Name')
Output:
Name Home HW variable value
0 A Z 1.2 1-0 2.2
3 A Z 1.2 2-0 1.5
6 A Z 1.2 2-1 5.0
1 B Y 3.2 1-0 3.1
4 B Y 3.2 2-0 5.0
7 B Y 3.2 2-1 2.0
2 C X 4.0 1-0 3.0
5 C X 4.0 2-0 8.0
8 C X 4.0 2-1 3.0
Also,you can use set_index
and stack
:
data.set_index(['Name','Home','HW']).stack().reset_index()
Output:
Name Home HW variable value
0 A Z 1.2 1-0 2.2
3 A Z 1.2 2-0 1.5
6 A Z 1.2 2-1 5.0
1 B Y 3.2 1-0 3.1
4 B Y 3.2 2-0 5.0
7 B Y 3.2 2-1 2.0
2 C X 4.0 1-0 3.0
5 C X 4.0 2-0 8.0
8 C X 4.0 2-1 3.0