df.apply(): AttributeError: 'NoneType' object has no attribute 'split' in Python


I am encountering an AttributeError while running the following commands in Python:

start = datetime.now()

df_no_dup["tag_count"] = df_no_dup["Tags"].apply(lambda text: len(text.split(" ")))
# adding a new feature number of tags per question
print("Time taken to run this cell :", datetime.now() - start)
df_no_dup.head()

The error message is as follows:

AttributeError: 'NoneType' object has no attribute 'split'

I am trying to calculate the number of tags per question using the apply function, but it seems like there might be instances where the "Tags" column contains NoneType objects. How can I modify the code to handle such cases and avoid the AttributeError?

Any assistance in resolving this issue would be appreciated.


Solution

  • If every value in the column is either a string or None, guard the split with a truthiness check so that missing (or empty) values count as zero tags:

    from datetime import datetime
    
    
    start = datetime.now()
    
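    # Add a new feature: the number of tags per question.
    # The conditional sits outside len(), so a None or empty value yields 0
    # instead of being passed to len().
    df_no_dup["tag_count"] = df_no_dup["Tags"].apply(lambda text: len(text.split(" ")) if text else 0)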
    print("Time taken to run this cell :", datetime.now() - start)
    df_no_dup.head()
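
  • As an alternative, pandas' vectorized string methods can do the same count without a lambda, because the .str accessor passes missing values through as NaN rather than raising. The following is only a minimal sketch: the sample DataFrame is made up for illustration and stands in for the real df_no_dup, and it assumes the tags are separated by single spaces.

```python
import pandas as pd

# Hypothetical sample data standing in for df_no_dup; None marks a missing tag string.
df_no_dup = pd.DataFrame(
    {"Tags": ["python pandas", None, "c++", "java spring hibernate"]}
)

# .str.split skips missing values instead of raising, leaving them as missing.
tag_lists = df_no_dup["Tags"].str.split(" ")

# .str.len counts the items in each list; missing rows become NaN, which we map to 0.
df_no_dup["tag_count"] = tag_lists.str.len().fillna(0).astype(int)

print(df_no_dup)
```

    One difference from the lambda version: an empty string splits into a list containing one empty string, so this sketch would count it as 1 tag rather than 0. If empty strings can occur in the column, they would need separate handling.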