Search code examples
pythonpandasdataframemergerows

Selecting specific rows in pandas dataframe merging


I have 4 dataframes in the array that I keep.

     0    1    2
0  0.0  1.0  2.0
1  0.0  1.0  2.0
2  0.0  1.0  2.0
3  0.0  1.0  2.0
4  0.0  2.0  3.0
5  0.0  2.0  3.0
6  0.0  3.0  4.0
7  0.0  3.0  4.0

     0    1    2
0  1.0  4.0  4.0
1  1.0  5.0  5.0

     0    1    2
0  2.0  6.0  4.0

     0    1    2
0  3.0  7.0  6.0

Roughly what I want to do is combine these dataframes according to certain conditions. For example, adding data frames starting with the value in column 1 in row 0 in the first data frame to the first data frame. As a result, the first dataframe takes a format like this:

     0    1    2   3    4    5
0  0.0  1.0  2.0
1  0.0  1.0  2.0
2  0.0  1.0  2.0 1.0  4.0  4.0
3  0.0  1.0  2.0 1.0  5.0  5.0
4  0.0  2.0  3.0
5  0.0  2.0  3.0 2.0  6.0  4.0
6  0.0  3.0  4.0
7  0.0  3.0  4.0 3.0  7.0  6.0

The state of the data in dict format.

{0: {0: 0.0, 1: 0.0, 2: 0.0, 3: 0.0, 4: 0.0, 5: 0.0, 6: 0.0, 7: 0.0}, 1: {0: 1.0, 1: 1.0, 2: 1.0, 3: 1.0, 4: 2.0, 5: 2.0, 6: 3.0, 7: 3.0}, 2: {0: 2.0, 1: 2.0, 2: 2.0, 3: 2.0, 4: 3.0, 5: 3.0, 6: 4.0, 7: 4.0}}
{0: {0: 1.0, 1: 1.0}, 1: {0: 4.0, 1: 5.0}, 2: {0: 4.0, 1: 5.0}}
{0: {0: 2.0}, 1: {0: 6.0}, 2: {0: 4.0}}
{0: {0: 3.0}, 1: {0: 7.0}, 2: {0: 6.0}}

I also tried various algorithms and could not solve the problem in this part.


Solution

  • First, rename the columns of df2, df3, df4 from 0,1,2 to 3,4,5

    for df in [df2, df3, df4]:
        df.rename(columns={0:3, 1:4, 2:5}, inplace=True)
    

    Second, change the index of these columns to the row index where you want to append them in df1

    df2.index = [2,3]
    df3.index = [5]
    df4.index = [7]
    

    Now, you can use two consecutive pd.concat, to have the expected dataframe output. The first concat concatenates df2, df3, and df4 in the row, second concat concatenates this output to df1 on column, something like this:

    pd.concat([df1,pd.concat([df2, df3, df4], axis=0)], axis=1 )
    

    OUTPUT

         0    1    2    3    4    5
    0  0.0  1.0  2.0  NaN  NaN  NaN
    1  0.0  1.0  2.0  NaN  NaN  NaN
    2  0.0  1.0  2.0  1.0  4.0  4.0
    3  0.0  1.0  2.0  1.0  5.0  5.0
    4  0.0  2.0  3.0  NaN  NaN  NaN
    5  0.0  2.0  3.0  2.0  6.0  4.0
    6  0.0  3.0  4.0  NaN  NaN  NaN
    7  0.0  3.0  4.0  3.0  7.0  6.0