python pandas dataframe python-2.7 concatenation

How do I copy a row from one pandas dataframe to another pandas dataframe?


I have a dataframe whose rows I am trying to append to another dataframe. I have tried several approaches with .append(), and none of them has worked. The code below prints each row from iterrows() and shows the two ways I tried to solve the issue: one raises an error, and the other doesn't populate the dataframe with anything.

The workflow I am trying to build starts by creating a dataframe from a file that contains the transaction history of customer orders. I want only a single record per order, and I am going to add other logic that updates the order details as later entries in the history change them. By the end of the script, the dataframe should hold one record for each order, showing the end state of that order after iterating through the history file.

class order_manager():
    """Manages the current state of orders"""
    
    def __init__(self,dataF, desc='NONE'):
        self.df = pd.DataFrame
        self.data = dataF
        print type(dataF)
        self.oD= self.df(data=None,columns=desc)
    
    def add_data(self,df):
        for i, row in self.data.iterrows():
            print 'row '+str(row)
            print type(row)
            df.append(self.data[i], ignore_index=True)  # This line creates an error
            df.append(row, ignore_index=True)  # This line doesn't append anything to the dataframe

test = order_manager(body,header)
test.add_data(test.oD)

Solution

  • Use .loc to enlarge the current df. See the example below.

    import pandas as pd
    import numpy as np
    
    date_rng = pd.date_range('2015-01-01', periods=200, freq='D')
    
    df1 = pd.DataFrame(np.random.randn(100, 3), columns='A B C'.split(), index=date_rng[:100])
    Out[410]: 
                     A       B       C
    2015-01-01  0.2799  0.4416 -0.7474
    2015-01-02 -0.4983  0.1490 -0.2599
    2015-01-03  0.4101  1.2622 -1.8081
    2015-01-04  1.1976 -0.7410  0.4221
    2015-01-05  1.3311  1.0399  2.2701
    ...            ...     ...     ...
    2015-04-06 -0.0432  0.6131 -0.0216
    2015-04-07  0.4224 -1.1565  2.2285
    2015-04-08  0.0663  1.2994  2.0322
    2015-04-09  0.1958 -0.4412  0.3924
    2015-04-10  0.1622  1.7603  1.4525
    
    [100 rows x 3 columns]
    
    
    df2 = pd.DataFrame(np.random.randn(100, 3), columns='A B C'.split(), index=date_rng[100:])
    Out[411]: 
                     A       B       C
    2015-04-11  1.1196 -1.9627  0.6615
    2015-04-12 -0.0098  1.7655  0.0447
    2015-04-13 -1.7318 -2.0296  0.8384
    2015-04-14 -1.5472 -1.7220 -0.3166
    2015-04-15  2.5058  0.6487  1.0994
    ...            ...     ...     ...
    2015-07-15 -1.4803  2.1703 -1.9391
    2015-07-16 -1.7595 -1.7647 -1.0622
    2015-07-17  1.7900  0.2280 -1.8797
    2015-07-18  0.7909 -0.4999  0.3848
    2015-07-19  1.2243  0.4681 -1.2323
    
    [100 rows x 3 columns]
    
    # to copy one row from df2 to df1, use .loc to enlarge df1 in place
    # (pd.concat and DataFrame.append return a new frame and leave df1 unchanged)
    df1.loc[df2.index[0]] = df2.iloc[0]
    
    Out[413]: 
                     A       B       C
    2015-01-01  0.2799  0.4416 -0.7474
    2015-01-02 -0.4983  0.1490 -0.2599
    2015-01-03  0.4101  1.2622 -1.8081
    2015-01-04  1.1976 -0.7410  0.4221
    2015-01-05  1.3311  1.0399  2.2701
    ...            ...     ...     ...
    2015-04-07  0.4224 -1.1565  2.2285
    2015-04-08  0.0663  1.2994  2.0322
    2015-04-09  0.1958 -0.4412  0.3924
    2015-04-10  0.1622  1.7603  1.4525
    2015-04-11  1.1196 -1.9627  0.6615
    
    [101 rows x 3 columns]
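
Applied to the workflow in the question, the same idea might look like the sketch below. The column names (`order_id`, `status`, `qty`) and the sample history are made up for illustration and are not taken from the asker's file. The sketch relies on two points: `.loc` with a new label adds a row while `.loc` with an existing label overwrites it, which gives one record per order; and `pd.concat` returns a new frame, so its result has to be assigned. The second point is also the likely reason `df.append(row, ignore_index=True)` in the question left the dataframe empty, since `DataFrame.append` returned a new frame too and its result was discarded. `DataFrame.append` was deprecated in pandas 1.4 and removed in 2.0, so it is not used here.

```python
import pandas as pd

# Hypothetical transaction history: several events per order.
history = pd.DataFrame(
    {
        "order_id": [101, 102, 101, 103, 102],
        "status": ["new", "new", "shipped", "new", "cancelled"],
        "qty": [5, 2, 5, 7, 0],
    }
)

# Empty frame that will hold one row per order (its latest state).
orders = pd.DataFrame(columns=["status", "qty"])

for _, row in history.iterrows():
    # A new order_id enlarges the frame; a known order_id overwrites its row.
    orders.loc[row["order_id"]] = row[["status", "qty"]]

print(orders)

# Copying a single row with pd.concat: the result is a NEW frame,
# so it must be assigned; df1 itself is left unchanged.
df1 = pd.DataFrame({"A": [1, 2]}, index=["a", "b"])
df2 = pd.DataFrame({"A": [3, 4]}, index=["c", "d"])
combined = pd.concat([df1, df2.iloc[[0]]])  # iloc[[0]] keeps a one-row DataFrame

print(combined)
```

Enlarging with `.loc` inside a loop is convenient for small histories. For a large file it is probably cheaper to collect the latest state per order in a dict and build the dataframe once at the end, because each enlargement may copy the existing data.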