pandas, dataframe, replace

Using replace() to replace a string with NaN inside DataFrame.map() raises TypeError


I realize there are working alternatives to this. I just want to understand what is going on, for my own edification and for anyone else who comes across this.

import numpy as np
import pandas as pd

df_test = pd.DataFrame({'test1': ['blah1', 'blah2', 'blah3'], 'test2': ['blah1', 'blah2', 'blah3']})

When I run the code below on this DataFrame, I get TypeError: replace() argument 2 must be str, not float

df_test.map(lambda x: x.replace('blah1',np.nan))

What exactly prevents 'argument 2' from being np.nan here, when the call below works with no problem?

df_test.replace('blah1',np.nan)

Thanks in advance


Solution

  • Since map() is applied element-wise, each element (x) is a plain Python string. Calling x.replace() therefore invokes Python's built-in str.replace() method, not pandas' DataFrame.replace().

    The error is raised because str.replace() requires both arguments to be strings, and np.nan is a float.
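
To make the distinction concrete, here is a minimal sketch that reproduces the error on a bare string and shows one element-wise alternative. The names str_replace_error, elementwise, mapped and replaced are illustrative only, and the fallback to applymap() assumes a pandas version older than 2.1, where DataFrame.map() is not available.

```python
import numpy as np
import pandas as pd

df_test = pd.DataFrame({'test1': ['blah1', 'blah2', 'blah3'],
                        'test2': ['blah1', 'blah2', 'blah3']})

# Inside map(), x is a plain Python str, so x.replace is str.replace.
# str.replace only accepts a string as its second argument.
try:
    'blah1'.replace('blah1', np.nan)
    str_replace_error = None
except TypeError as exc:
    str_replace_error = str(exc)

# Assumption: DataFrame.map exists on pandas >= 2.1; otherwise use applymap.
elementwise = getattr(df_test, 'map', None) or df_test.applymap

# Element-wise alternative: return NaN for a matching cell
# instead of calling str.replace on it.
mapped = elementwise(lambda x: np.nan if x == 'blah1' else x)

# DataFrame.replace matches whole cell values and accepts NaN as the new value.
replaced = df_test.replace('blah1', np.nan)

print(str_replace_error)
print(mapped)
print(replaced)
```

Note that the two replace() methods also differ in what they match: str.replace() substitutes substrings within each string, while DataFrame.replace() with a plain string replaces only cells whose whole value equals it.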