Tags: python, dataframe, export-to-csv

Write two dataframes to two different columns in CSV


I converted two arrays into two dataframes and would like to write them to a CSV file as two separate columns. The dataframes have no columns in common. I tried the solutions below, along with others from Stack Exchange, but none gave the result I want. Solution 2 raises no error, but it prints all the data into one column. I am guessing the problem is in how the arrays are converted to dataframes? I basically want the Frequency and PSD values exported to CSV as two columns. How do I do that?

Solution 1:

df_BP_frq = pd.DataFrame(freq_BP[L_BP], columns=['Frequency'])
df_BP_psd = pd.DataFrame(PSDclean_BP[L_BP], columns=['PSD'])

df_BP_frq['tmp'] = 1
df_BP_psd['tmp'] = 1
df_500 = pd.merge(df_BP_frq, df_BP_psd, on=['tmp'], how='outer')
df_500 = df_500.drop('tmp', axis=1)

Error: Unable to allocate 2.00 TiB for an array with shape (274870566961,) and data type int64

Solution 2:

df_BP_frq = pd.DataFrame(freq_BP[L_BP], columns=['Frequency'])
df_BP_psd = pd.DataFrame(PSDclean_BP[L_BP], columns=['PSD'])

df_500 = df_BP_frq.merge(df_BP_psd, left_on='Frequency', right_on='PSD', how='outer')

No error. Result: the PSD values are all 0 and appear below the frequency values, in the lower rows.

Solution 3:

df_BP_frq = pd.DataFrame(freq_BP[L_BP], columns=['Frequency'])
df_BP_psd = pd.DataFrame(PSDclean_BP[L_BP], columns=['PSD'])

df_500 = pd.merge(df_BP_frq, df_BP_psd, on='tmp').ix[:, ('Frequency','PSD')]

Error: KeyError: 'tmp'

Exporting to csv using:

df_500.to_csv("PSDvalues500.csv", index = False, sep=',', na_rep = 'N/A', encoding = 'utf-8')

Solution

  • You can store the arrays directly as columns of the dataframe. If both arrays have the same length, the following method works.

    df_500 = pd.DataFrame()
    df_500['Frequency'] = freq_BP[L_BP]
    df_500['PSD'] = PSDclean_BP[L_BP]
    

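    If the arrays have different lengths, you can convert them to Series and then add them as columns in the following way. This fills the missing values in the dataframe with NaN. Note that column assignment aligns each Series to the dataframe's existing index, so add the longer array first; otherwise its extra values are dropped.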

    df_500 = pd.DataFrame()
    df_500['Frequency'] = pd.Series(freq_BP[L_BP])
    df_500['PSD'] = pd.Series(PSDclean_BP[L_BP])
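
If you would rather not depend on the order in which the columns are added, one alternative (not part of the answer above) is to place the two Series side by side with pd.concat. The sketch below is only an illustration: freq and psd are made-up stand-ins for freq_BP[L_BP] and PSDclean_BP[L_BP], and the CSV is written to an in-memory buffer instead of PSDvalues500.csv so the result can be inspected.

```python
import io

import numpy as np
import pandas as pd

# Hypothetical stand-ins for freq_BP[L_BP] and PSDclean_BP[L_BP].
freq = np.array([10.0, 20.0, 30.0])
psd = np.array([0.5, 0.25, 0.125, 0.0625])  # deliberately one element longer

# axis=1 places the Series next to each other as columns. Rows are aligned
# on the union of both indexes, so the shorter column is padded with NaN
# regardless of which Series comes first.
df_500 = pd.concat(
    [pd.Series(freq, name="Frequency"), pd.Series(psd, name="PSD")],
    axis=1,
)

# Same to_csv options as in the question; the buffer stands in for a file path.
buffer = io.StringIO()
df_500.to_csv(buffer, index=False, sep=",", na_rep="N/A")
csv_text = buffer.getvalue()
print(csv_text)
```

To write an actual file, pass the file name (for example "PSDvalues500.csv") to to_csv in place of the buffer. Unlike pd.merge, this pairs values by row position and does not need a shared key column.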