Search code examples
pythonpandassplice

Taking the 1st and 2nd, 4th and 5th etc rows from a single Pandas column and put in two new columns, Python


Below is a sample of a pandas dataframe, a single column with 1000's of rows.

I need second/third columns putting data in rows 1 and 2, 4 and 5 etc in the second/third column

                       0
 0    \\path_1\file1.csv                                                                                                                                                     
 1    23082022 DS
 2    520i2146002
 3    \\path_2\file2.csv                                                                                                                                
 4    230822 NS
 5    520i2146002
 4    \\path_3\file3.csv                                                                                                                                             
 5    24082022 DS
 6    520i2146002

Desired Output

                       0        1            2     
 0    \\path_1\file1.csv      23082022 DS    520i2146002                                                                                                                                           
 1    \\path_2\file2.csv      230822 NS      520i2146002                                                                                                                             
 2    \\path_3\file3.csv      24082022 DS    520i2146002                                                                                                                                    
        

Can only manage to pull out the odds with:

df = pd.DataFrame({0: df[0].iloc[::2].values, 'value': df[0].iloc[1::2].values})

Suggestions?


Solution

  • Make three subsets by taking every third value - starting at 0, 1, and 2. Reset each subset's index, and concat them all together.

    df = pd.concat([df.loc[::3].reset_index(drop=True),
                    df.loc[1::3].reset_index(drop=True), 
                    df.loc[2::3].reset_index(drop=True)], axis=1, ignore_index=True)
    print(df)
    

    Output:

                        0            1            2
    0  \\path_1\file1.csv  23082022 DS  520i2146002
    1  \\path_2\file2.csv    230822 NS  520i2146002
    2  \\path_3\file3.csv  24082022 DS  520i2146002
    

    Slightly more concise:

    df = pd.concat([df.loc[i::3].reset_index(drop=True) for i in range(3)],
                   axis=1, ignore_index=True)