python pandas matplotlib attributeerror

How can I fix an AttributeError in Python?


I have the code:

df_mean_woman = df_mean_woman.rename(index = {"Less than 1 year":0}, inplace = True)
df_mean_woman

And when I run it I get the error

AttributeError                            Traceback (most recent call last)
<ipython-input-136-94a5cc6acf63> in <module>
----> 1 df_woman = df_woman.rename(index = {"Less than 1 year":0},
      2                                   #"More than 50 years":int(51)},
      3                                   inplace = True)
      4 df_woman

AttributeError: 'NoneType' object has no attribute 'rename'

The error goes away when I simply run df_mean_woman.rename(index = {"Less than 1 year":0}, inplace = True) without assigning the result, but I cannot simply do that because I need to use df again later. I have tried quite a few things to fix this, but nothing seems to work. I do not think the cause is a misspelling of "Less than 1 year". My main issue seems to be that when I print df_mean_woman (before the rename), I am told that df does not exist. When I restart Jupyter I am able to print df, but all that gets printed is 'None'.

My full code is

import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
df = pd.read_csv('data.csv') 
%matplotlib inline
df_new = df.copy()
df_new = df_new.drop(['Age1stCode','CompTotal','Respondent', 'MainBranch', 'Hobbyist', 'Age', 'CompFreq', 'Country', 'CurrencyDesc', 'CurrencySymbol', 'DatabaseDesireNextYear', 'DatabaseWorkedWith', 'DevType', 'EdLevel', 'Employment', 'Ethnicity', 'JobFactors', 'JobSat', 'JobSeek', 'LanguageDesireNextYear', 'LanguageWorkedWith', 'MiscTechDesireNextYear', 'MiscTechWorkedWith', 'NEWCollabToolsDesireNextYear', 'NEWCollabToolsWorkedWith', 'NEWDevOps', 'NEWDevOpsImpt', 'NEWEdImpt', 'NEWJobHunt', 'NEWJobHuntResearch', 'NEWLearn', 'NEWOffTopic', 'NEWOnboardGood', 'NEWOtherComms', 'NEWOvertime', 'NEWPurchaseResearch', 'NEWPurpleLink', 'NEWSOSites', 'NEWStuck', 'OpSys', 'OrgSize', 'PlatformDesireNextYear', 'PlatformWorkedWith', 'PurchaseWhat', 'Sexuality', 'SOAccount', 'SOComm', 'SOPartFreq', 'SOVisitFreq', 'SurveyEase', 'SurveyLength', 'Trans', 'UndergradMajor', 'WebframeDesireNextYear', 'WebframeWorkedWith', 'WelcomeChange', 'WorkWeekHrs', 'YearsCodePro'], axis = 'columns')
df_new = df_new.dropna()
df_new    
df_woman = df_new.drop(index=df_new[df_new['Gender'] != 'Woman'].index, inplace=True)
df_woman = df_new
df_woman = df_woman.drop(['Gender'], axis ='columns')
df_news = df_new.copy()

df_woman = df_woman.rename(index = {"Less than 1 year":int(0)},
                                  #"More than 50 years":int(51)},
                                  inplace = True)
df_woman['YearsCode'] = df_woman['YearsCode'].apply(lambda x: '{0:0>2}'.format(x))
df_mean_woman = df_woman.groupby('YearsCode')['ConvertedComp'].mean().sort_index()

df_mean_woman

Solution

  • It looks like you are excluding more columns than you are keeping, so it would be easier to list the columns you want rather than the much longer list of columns you want to drop.

    Overall, I would not use drop and would instead use loc for most of these operations. It is also unclear why you are trying to manipulate the index rather than the column values.

    # looks like stackoverflow survey data
    df = pd.read_csv('survey_results_public.csv')
    
    unwanted = {'Age1stCode','CompTotal','Respondent', 'MainBranch', 'Hobbyist', 'Age', 'CompFreq', 'Country', 
                'CurrencyDesc', 'CurrencySymbol', 'DatabaseDesireNextYear', 'DatabaseWorkedWith', 'DevType', 
                'EdLevel', 'Employment', 'Ethnicity', 'JobFactors', 'JobSat', 'JobSeek', 'LanguageDesireNextYear', 
                'LanguageWorkedWith', 'MiscTechDesireNextYear', 'MiscTechWorkedWith', 'NEWCollabToolsDesireNextYear', 
                'NEWCollabToolsWorkedWith', 'NEWDevOps', 'NEWDevOpsImpt', 'NEWEdImpt', 'NEWJobHunt', 'NEWJobHuntResearch', 
                'NEWLearn', 'NEWOffTopic', 'NEWOnboardGood', 'NEWOtherComms', 'NEWOvertime', 'NEWPurchaseResearch', 
                'NEWPurpleLink', 'NEWSOSites', 'NEWStuck', 'OpSys', 'OrgSize', 'PlatformDesireNextYear', 
                'PlatformWorkedWith', 'PurchaseWhat', 'Sexuality', 'SOAccount', 'SOComm', 'SOPartFreq', 'SOVisitFreq', 
                'SurveyEase', 'SurveyLength', 'Trans', 'UndergradMajor', 'WebframeDesireNextYear', 'WebframeWorkedWith', 
                'WelcomeChange', 'WorkWeekHrs', 'YearsCodePro'}
    
    # no need to copy dataframe before selecting columns;
    # the list comprehension keeps the remaining columns in their original order
    df_new = df.loc[:, [col for col in df.columns if col not in unwanted]]
    
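    # use .loc to keep only the rows where Gender is 'Woman' and drop the Gender column;
    # .copy() avoids a SettingWithCopyWarning when YearsCode is modified below
    df_woman = df_new.loc[df_new['Gender'] == 'Woman', df_new.columns.drop('Gender')].copy()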
    
    # convert strings to numeric values
    df_woman['YearsCode'] = df_woman['YearsCode'].str.replace('Less than 1 year', '0')
    df_woman['YearsCode'] = df_woman['YearsCode'].str.replace('More than 50 years', '51')
    df_woman['YearsCode'] = pd.to_numeric(df_woman['YearsCode'], errors='coerce').fillna(0).astype(int)
    
    # now groupby and analyze
    df_woman.groupby('YearsCode')['ConvertedComp'].mean()
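
As for the AttributeError itself: in pandas, methods such as rename and drop return None when called with inplace=True, so assigning that result back to the same name replaces the DataFrame with None, and the next method call on it fails. A minimal sketch of this behaviour follows; the toy frame df_demo and its values are hypothetical stand-ins, not the survey data.

```python
import pandas as pd

# Hypothetical toy data standing in for the survey results.
df_demo = pd.DataFrame(
    {"ConvertedComp": [50000.0, 60000.0]},
    index=["Less than 1 year", "3"],
)

# With inplace=True, rename() changes df_demo itself and returns None.
# Writing `df_demo = df_demo.rename(..., inplace=True)` would therefore
# overwrite the DataFrame with None.
result = df_demo.rename(index={"Less than 1 year": "0"}, inplace=True)
print(result)               # None
print(list(df_demo.index))  # the index label was renamed in place

# Alternative: leave out inplace and assign the returned copy instead.
renamed = df_demo.rename(index={"3": "03"})
print(list(renamed.index))
```

Either style works on its own; the problem only appears when inplace=True and assignment are combined, as in the drop and rename lines of the question.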