How to split a column into two sub-columns under the same column name, both holding the same values, in Python?


I am trying to split a column internally so that there are two sub-columns under the same column name, and both of those sub-columns hold the same values.

I am trying to implement the following in Python using the pandas library.

Say my DataFrame looks like this:

        Column1  Column2
Row1       1        2
Row2       3        4 

Desired Output:

         Column1  Column2
 Row1     1 | 1    2 | 2
 Row2     3 | 3    4 | 4 

Solution

  • If we begin with your example:

    import pandas as pd

    df = pd.DataFrame({'Column1': [1, 3],
                       'Column2': [2, 4]},
                      index=['Row1', 'Row2'])
    
    df
    
          Column1   Column2
    Row1        1         2
    Row2        3         4
    
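    1. Use pd.concat to append a copy of every column, then select the columns by label (interleave comes from the toolz package) so that each column's duplicate appears just to its right:

    from toolz import interleave
    # The concatenated frame has duplicate labels, so selecting a label
    # returns both of its copies side by side.
    df = pd.concat([df, df], axis=1)[list(interleave([df]))]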
    

    At this point, the dataframe looks like:

    df
    
         Column1 Column1 Column2 Column2
    Row1       1       1       2       2
    Row2       3       3       4       4
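
If you would rather not depend on toolz, the same duplication can be sketched with positional indexing. This is only an illustration under the assumption that NumPy is available; the names `positions` and `doubled` are made up for the example and are not part of the original answer.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'Column1': [1, 3],
                   'Column2': [2, 4]},
                  index=['Row1', 'Row2'])

# Repeat every column position twice: [0, 0, 1, 1]
positions = np.repeat(np.arange(df.shape[1]), 2)

# Positional selection keeps each copy next to its original
doubled = df.iloc[:, positions]
print(doubled)
```

Because the selection is positional, it does not rely on how pandas resolves duplicate column labels.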
    
    2. Now change the dataframe's columns to a MultiIndex:

    # Note: the `labels` argument was renamed to `codes` in pandas 0.24.
    df.columns = pd.MultiIndex(levels=[['Column1', 'Column2'], ['A', 'B']],
                               codes=[[0, 0, 1, 1], [0, 1, 0, 1]])
    

    The resulting dataframe looks like this:

    df
    
            Column1   Column2
            A   B     A   B
    Row1    1   1     2   2
    Row2    3   3     4   4
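
As a possible alternative to the two steps above, the two-level header can also be built in one pass by letting pd.concat attach the sub-column labels through its `keys` argument. The sketch below is an assumption-laden variant, not the original answer's method; the helper name `split_columns` and its `sublabels` parameter are hypothetical.

```python
import pandas as pd

def split_columns(df, sublabels=('A', 'B')):
    """Return a copy of df with every column repeated once per sublabel (hypothetical helper)."""
    # keys= adds an outer column level holding the sublabels: (A, Column1), (A, Column2), ...
    wide = pd.concat([df] * len(sublabels), axis=1, keys=list(sublabels))
    # Move the original column names to the outer level: (Column1, A), (Column2, A), ...
    wide = wide.swaplevel(axis=1)
    # Put the columns in original order, with the sublabels grouped under each name
    target = pd.MultiIndex.from_product([df.columns, list(sublabels)])
    return wide.reindex(columns=target)

df = pd.DataFrame({'Column1': [1, 3],
                   'Column2': [2, 4]},
                  index=['Row1', 'Row2'])

result = split_columns(df)
print(result)
```

With this layout, `result['Column1']` returns the two sub-columns A and B, and `result[('Column2', 'B')]` returns a single sub-column as a Series.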