python pandas dataframe multiple-columns smoothing

Columns must be same length as key


I have this imputed DataFrame:

   Imputed_Df.head():


                      Atmospheric_Pressure  Global_Radiation  Net_Radiation  Precipitation  Relative_Humidity  Temperature  Wind_Direction  Wind_Speed
 Time
 2013-11-01 01:00:00               999.451            207.75          99.09       4.450000          39.958667    13.600000      117.231667    2.138500
 2013-11-01 05:00:00               992.760            167.77          85.16       5.746667          56.107500    11.900000      244.410000    2.313000
 2013-11-01 09:00:00               990.272            157.00          95.04       6.271000          37.113333    12.802083      297.131500    3.270350
 2013-11-01 10:00:00               998.367            191.26          82.32       4.428000          37.946500    13.800000      143.103333    2.232500

What I want to do is smooth all the columns and then add the new smoothed columns to this DataFrame. Here's how I tried to do it:

import statsmodels
from statsmodels.tsa.api import ExponentialSmoothing, SimpleExpSmoothing, Holt


def Smoothing(Col):
  for Col in Imputed_Df.columns:
    fit = SimpleExpSmoothing(Imputed_Df[Col]).fit(smoothing_level=0.2, optimized=False)
    fcast = fit.predict(start=Imputed_Df.index.min(), end=Imputed_Df.index.max())
    return fcast



 Imputed_Df[['col1', 'col2', 'col3','col4','col5' , 'col6' , 'col7' , 'col8']] = Imputed_Df.apply(Smoothing, axis=1)

But I got this error:

 Columns must be same length as key

Any suggestion would be much appreciated. Thank you.


Solution

  • Assuming your Smoothing function is correct, run

    print(Imputed_Df.apply(Smoothing, axis=1))
    

    and check the number of columns in the returned DataFrame. It must be 8, because you are assigning to the 8 keys in Imputed_Df[['col1', 'col2', 'col3','col4','col5' , 'col6' , 'col7' , 'col8']].
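To see why the column count matters, here is a minimal sketch on a toy DataFrame. The frame `df` and the function `returns_three_values` are made up for illustration and are not from the question. Assigning a 3-column result to a 2-name key should raise the same kind of error:

```python
import pandas as pd

# Toy frame, purely illustrative (not the question's data)
df = pd.DataFrame({"a": [1.0, 2.0, 3.0], "b": [4.0, 5.0, 6.0]})

def returns_three_values(row):
    # Hypothetical function: returns 3 values per row,
    # so apply(..., axis=1) produces a 3-column DataFrame
    return pd.Series([row["a"], row["b"], row["a"] + row["b"]])

result = df.apply(returns_three_values, axis=1)
print(result.shape)  # number of columns is result.shape[1]

error_message = None
try:
    # 2 target names on the left, 3 columns on the right
    df[["col1", "col2"]] = result
except ValueError as exc:
    error_message = str(exc)
    print(error_message)
```

Comparing `result.shape[1]` with the number of names on the left-hand side is usually the quickest way to locate the mismatch.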

    If the output does not have 8 columns, try:

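    import statsmodels
    from statsmodels.tsa.holtwinters import SimpleExpSmoothing
    def Smoothing(Imputed_Df):
        # Work on a copy so the caller's DataFrame is left untouched
        my_df = Imputed_Df.copy()
        # Smooth each column separately and overwrite it in the copy
        for Col in Imputed_Df.columns:
            fit = SimpleExpSmoothing(Imputed_Df[Col]).fit(smoothing_level=0.2, optimized=False)
            my_df[Col] = fit.predict(start=Imputed_Df.index.min(), end=Imputed_Df.index.max())
        # Return the whole frame: one smoothed column per input column
        return my_df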
    

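    I also noticed that your time series is irregularly spaced, so resample it to a regular hourly frequency (forward-filling the gaps) before smoothing: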

    Imputed_Df = Imputed_Df.resample('H').ffill()  # .ffill() replaces the deprecated .pad()
    Imputed_Df[['col1', 'col2', 'col3','col4','col5' , 'col6' , 'col7' , 'col8']] = Smoothing(Imputed_Df)
    

    I would prefer writing it this way, so the new column names are derived from the original ones:

    Imputed_Df[Imputed_Df.columns + "_SES"] = Smoothing(Imputed_Df)
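
The resample, smooth, and suffix steps can be tried end to end on a small toy frame. This is only a sketch under stated assumptions: `toy_df`, `hourly`, and `smooth_all` are hypothetical names, and the smoother is pandas' `ewm(alpha=0.2, adjust=False)` swapped in as a stand-in for statsmodels' `SimpleExpSmoothing`, so the numbers will not necessarily match what statsmodels returns.

```python
import pandas as pd

# Toy data with irregular timestamps, loosely modeled on the question's sample
index = pd.to_datetime([
    "2013-11-01 01:00:00",
    "2013-11-01 05:00:00",
    "2013-11-01 09:00:00",
    "2013-11-01 10:00:00",
])
toy_df = pd.DataFrame(
    {
        "Temperature": [13.6, 11.9, 12.802083, 13.8],
        "Wind_Speed": [2.1385, 2.313, 3.27035, 2.2325],
    },
    index=index,
)

# Regularize to an hourly grid and forward-fill the gaps
hourly = toy_df.resample("60min").ffill()

def smooth_all(frame, alpha=0.2):
    # Stand-in smoother (assumption, not statsmodels):
    # s_t = alpha * y_t + (1 - alpha) * s_{t-1}, with s_0 = y_0
    return frame.ewm(alpha=alpha, adjust=False).mean()

# The smoothed frame has one column per input column,
# so the key and the value have the same length
smoothed = smooth_all(hourly)
hourly[hourly.columns + "_SES"] = smoothed

print(hourly.columns.tolist())
```

Because the function returns a full DataFrame with as many columns as the input, the list of new names on the left always matches it in length, which is what avoids the error.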