python pandas dataframe feature-scaling

Plotting a dataframe with differently scaled values in Python


I have the following dataframe:

import pandas as pd

df = pd.DataFrame({
'Date': [1930, 1931, 1932, 1933, 1934],
'Income': [2300000, 5698907, 5976753, 6086762, 6577780],
'Age': [22, 45, 35, 40, 28],
'Weight': [0.01, 0.003, 0.04, 0.08, 0.07]
})

Each variable is on a different scale. I want to plot the variables on one graph, but because of the scale difference I can only see the Income line. I plotted using

df.plot(figsize=(20, 10), linewidth=5, fontsize=20)

I decided to apply feature scaling based on what I found online, so I did the following:

import pandas as pd
from sklearn.preprocessing import StandardScaler
scaler = StandardScaler()
df = scaler.fit_transform(df)

I then tried to plot the dataframe after feature scaling, and it gave the following error:

AttributeError: 'numpy.ndarray' object has no attribute 'plot'

I'm not sure where to go from here. The aim is to plot all the variables on one graph.


Solution

  • I believe you need to create a new DataFrame, because fit_transform returns a 2D NumPy array rather than a DataFrame:

    import pandas as pd
    from sklearn.preprocessing import StandardScaler
    scaler = StandardScaler()
    
    # Wrap the returned array so the column names and index are preserved
    df = pd.DataFrame(scaler.fit_transform(df), columns=df.columns, index=df.index)
    df.plot(figsize=(20,10), linewidth=5, fontsize = 20)
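
If scikit-learn is not otherwise needed, the same standardization can be sketched in pandas alone, which keeps the result a DataFrame throughout. This is only a minimal sketch under two assumptions that are not in the question: that Date is meant to be the x-axis (so it is moved to the index instead of being scaled), and that the names `values` and `scaled` are just illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    'Date': [1930, 1931, 1932, 1933, 1934],
    'Income': [2300000, 5698907, 5976753, 6086762, 6577780],
    'Age': [22, 45, 35, 40, 28],
    'Weight': [0.01, 0.003, 0.04, 0.08, 0.07],
})

# Assumption: Date is the x-axis, so keep it out of the scaling.
values = df.set_index('Date')

# Z-score each column. ddof=0 uses the population standard deviation,
# which is what StandardScaler uses as well.
scaled = (values - values.mean()) / values.std(ddof=0)

# scaled is still a DataFrame, so .plot() is available, e.g.:
# scaled.plot(figsize=(20, 10), linewidth=5, fontsize=20)
```

After this, every column has mean 0 and unit variance, so all three lines should be visible on one graph. Note that the y-axis then shows standardized values, not the original units.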