Python Dataframe Normalization ValueError: Columns must be same length as key


I am trying to create a function for normalizing a given dataframe for the requested frequency.

Code:

import numpy as np
import pandas as pd
def timeseries_dataframe_normalized(df, normalization_freq = 'complete'):
    """
    Input: 
        df : dataframe 
             input dataframe
        normalization_freq : string
            'daily', 'weekly', 'monthly','quarterly','yearly','complete' (default)
    Return: normalized dataframe
    
    """
    # auxiliary dataframe
    adf = df.copy()
    # convert columns to float
    # Ref: https://stackoverflow.com/questions/15891038/change-column-type-in-pandas
    adf = adf.astype(float)
    # normalized columns
    nor_cols = adf.columns
    # add suffix to columns and create new names for maximum columns
    max_cols = adf.add_suffix('_max').columns
    # initialize maximum columns
    adf.loc[:,max_cols] = np.nan
    # check the requested frequency
    if normalization_freq =='complete':
        adf[max_cols] = adf[nor_cols].max()
    # compute and return the normalized dataframe
    print(adf[nor_cols])
    print(adf[max_cols])
    adf[nor_cols] = adf[nor_cols]/adf[max_cols]
    # return the normalized dataframe
    return adf[nor_cols]
    
# Example
df2 = pd.DataFrame(data={'A':[20,10,30],'B':[1,2,3]})
timeseries_dataframe_normalized(df2)

Expected output:

df2 =
          A         B
0  0.666667  0.333333
1  0.333333  0.666667
2  1.000000  1.000000

Present output:

I am surprised to get the following error. Computing df2/df2.max() directly gives the expected output, but this function raises an error:

ValueError: Columns must be same length as key

Solution

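  • Dividing one dataframe by another aligns on column labels. nor_cols (A, B) and max_cols (A_max, B_max) have no labels in common, so the quotient has four all-NaN columns, which cannot be assigned back to the two columns in adf[nor_cols]. Change the division line as follows, so that the dataframe is divided by a NumPy ndarray and the division is done by position instead: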

    adf[nor_cols] = adf[nor_cols] / adf[max_cols].to_numpy()
    

    The function then returns:

              A         B
    0  0.666667  0.333333
    1  0.333333  0.666667
    2  1.000000  1.000000
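
The label alignment behind the error, and the effect of the .to_numpy() fix, can be reproduced outside the function. The following is a minimal sketch, not the asker's code: the `maxima` frame is a hypothetical stand-in for adf[max_cols], built by hand with each column's maximum repeated on every row, and the variable names are made up for illustration.

```python
import numpy as np
import pandas as pd

# Same values as the question's example, already cast to float.
values = pd.DataFrame({'A': [20.0, 10.0, 30.0], 'B': [1.0, 2.0, 3.0]})

# Hypothetical stand-in for adf[max_cols]: the column maxima on every row.
maxima = pd.DataFrame({'A_max': [30.0] * 3, 'B_max': [3.0] * 3})

# DataFrame / DataFrame aligns on column labels. 'A' never matches 'A_max',
# so the result carries the union of both label sets, filled with NaN.
aligned = values / maxima
print(aligned.shape)              # four columns instead of two
print(aligned.isna().all().all())

# Dividing by a NumPy array ignores labels and works element by element,
# so the result keeps the two original column names.
positional = values / maxima.to_numpy()
print(positional)

# The one-liner from the question: a Series of maxima indexed by 'A' and 'B'
# is matched against the column labels, so no helper columns are needed.
direct = values / values.max()
print(direct)
```

Since df2/df2.max() already gives the expected result, as noted in the question, the helper _max columns are not strictly required for the 'complete' case; they only become relevant if the other frequencies need a per-row maximum.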