python pandas dataframe export-to-csv

Adding data to csv using pandas dataframe, by adding new column


I want to append data from Redshift tables to a CSV file using the pandas module in Python. I can already connect to Redshift and retrieve rows using the psycopg2 module. The CSV stores data by date, so each day I need to create a new date column in the CSV and then put the retrieved data in that column.

I am using the following commands to read from redshift tables:

import psycopg2

# The connection values below are placeholders
conn = psycopg2.connect(host='my_db_hostname', port='portnumber', user='username', password='password', dbname='db')
conn.autocommit = True
cur = conn.cursor()
cur.execute("""select emp_name, emp_login_count from public.emp_login_detail where login_date = current_date""")
records = cur.fetchall()
cur.close()

Now I want to add these emp_name and emp_login_count values to the existing CSV. Below is a snapshot of the CSV:

[Image: snapshot of the CSV file]

Every day I need to add a new date column to the CSV and then put each emp_login_count against the respective person's name.
I am new to pandas and have no idea how to implement this. Can someone please help me out?


Solution

  • Add the following and try it out:

    import pandas as pd

    records = cur.fetchall()

    # Create a dataframe from the SQL query's result
    column_names = ['emp_name', 'login_count']
    df = pd.DataFrame(records, columns=column_names)
    df.head()
    

    Now create another dataframe from the CSV file of daily login counts:

    df_daily = pd.read_csv('<INSERT the path_to_csv here>')
    df_daily.head()
    

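    Merge the two dataframes on the 'emp_name' column. Starting from df_daily keeps the existing date columns first, and how='outer' keeps employees that appear in only one of the two dataframes (merge defaults to an inner join, which would drop them):

    result = df_daily.merge(df, on='emp_name', how='outer')
    result.head()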
    

    After the join, rename the 'login_count' column to today's date:

    result.rename(columns = {'login_count':'< INSERT date here>'}, inplace = True)
    

    You can then save the new data into a csv file again:

    result.to_csv('<INSERT name of file.csv>', index=False)
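
The steps above can be combined into one small function. This is only a sketch under a few assumptions: the CSV has an 'emp_name' column, the new column should be named after today's date in ISO format, and the file is overwritten in place. The function name add_daily_column and its parameters are hypothetical, not part of pandas or psycopg2.

```python
import datetime

import pandas as pd


def add_daily_column(records, csv_path, date_label=None):
    """Add one column of login counts, named after the date, to the CSV.

    records is assumed to be a list of (emp_name, emp_login_count) tuples,
    for example the result of cur.fetchall().
    """
    if date_label is None:
        # Assumption: ISO dates such as 2024-01-02 are used as column names.
        date_label = datetime.date.today().isoformat()

    # Name the count column after the date directly, so no rename is needed.
    df_new = pd.DataFrame(records, columns=["emp_name", date_label])

    df_daily = pd.read_csv(csv_path)
    if date_label in df_daily.columns:
        # Re-running on the same day: replace the column instead of duplicating it.
        df_daily = df_daily.drop(columns=[date_label])

    # The outer join keeps employees with no login today (their cell is NaN)
    # and also adds employees who are not in the CSV yet.
    result = df_daily.merge(df_new, on="emp_name", how="outer")

    # Overwrite the same file so the next run sees the new column.
    result.to_csv(csv_path, index=False)
    return result
```

After the query has run, calling add_daily_column(records, '<INSERT the path_to_csv here>') would add today's column. Employees without a row for today end up with an empty cell in the CSV; chaining .fillna(0) before to_csv is one option if zeros are preferred.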