Search code examples
pythonpandasdataframetransformtransport

Transporting data to create a new data frame to re-structure the data frame


I have a csv file that contains data along the rows. I want to create a data frame by transporting all rows to columns.I have provided the following sample data frame that replicates the data in csv.

data = pd.DataFrame()
data ['Name'] = ['A','B','C']
data ['Home'] = ['Z','Y','X']
data ['1-0'] = [2.2,3.1,3]
data['2-0'] = [1.5,5,8]
data ['2-1'] = [5,2,3]
data ['HW'] = [1.2,3.2,4]

I want to convert the columns '1-0','2-0','2-1' to a column of a new data frame and its corresponding data to a new columns says Score. The first column ** Name** will be used as a new column but duplicates for all score.

The following is the expected output.

enter image description here

In the new data frame, the element from first column Name will be duplicated. The elements in the second column Score is some of the columns name from old data frame. The rest columns such as HW will be duplicated as the same as the name column.

May I know how to transform it? Your advice is much appreciated. Thanks. Zep.


Solution

  • Try melt:

    data.melt(['Name','Home','HW']).sort_values('Name')
    

    Output:

      Name Home   HW variable  value
    0    A    Z  1.2      1-0    2.2
    3    A    Z  1.2      2-0    1.5
    6    A    Z  1.2      2-1    5.0
    1    B    Y  3.2      1-0    3.1
    4    B    Y  3.2      2-0    5.0
    7    B    Y  3.2      2-1    2.0
    2    C    X  4.0      1-0    3.0
    5    C    X  4.0      2-0    8.0
    8    C    X  4.0      2-1    3.0
    

    Also,you can use set_index and stack:

    data.set_index(['Name','Home','HW']).stack().reset_index()
    

    Output:

      Name Home   HW variable  value
    0    A    Z  1.2      1-0    2.2
    3    A    Z  1.2      2-0    1.5
    6    A    Z  1.2      2-1    5.0
    1    B    Y  3.2      1-0    3.1
    4    B    Y  3.2      2-0    5.0
    7    B    Y  3.2      2-1    2.0
    2    C    X  4.0      1-0    3.0
    5    C    X  4.0      2-0    8.0
    8    C    X  4.0      2-1    3.0