I am struggling to remove NaNs. I have already spent some time searching for a solution, but nothing seems to work.
Below I am attaching a sample of my code. The whole notebook can be found on my GitHub here: https://github.com/jarsonX/Temp_files/blob/main/W3-Exploratory%20Data%20Analysis(1).ipynb
import pandas as pd
import seaborn as sns #not used in this sample, needed for plotting later on
import matplotlib as mpl #as above
import matplotlib.pyplot as plt #as above
import numpy as np #as above
df = pd.read_csv("https://cf-courses-data.s3.us.cloud-object-storage.appdomain.cloud/IBM-DA0321EN-SkillsNetwork/LargeData/m2_survey_data.csv")
df.Age.describe() #dtype float64
df['Age'].isna().value_counts() #287 nans
df['Age'].dropna(how='any', inplace=True) #trying to remove nans
df['Age'].isna().value_counts() #still 287 nans
#Just for the sake of identification of rows
#I tried to print ONLY nans but could not figure out how to do it.
i = 0
for el in df.Age:
    print(i, el, type(el))
    i += 1
#The first nan is in the 67th row
What am I missing?
UPDATE:
I've managed to filter out nans:
i = 0
for el in df.Age:
    if el != el:  #NaN is the only value that is not equal to itself
        print(i, el, type(el))
    i += 1
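As a side note on printing only the NaNs: a boolean mask built with isna() may be simpler than the manual loop. Here is a minimal sketch on a small made-up frame (the name sample and its values are hypothetical, not taken from the survey data):

```python
import numpy as np
import pandas as pd

# Hypothetical toy data standing in for the survey's Age column
sample = pd.DataFrame({"Age": [22.0, np.nan, 31.0, np.nan, 45.0]})

# Boolean mask: True wherever Age is NaN
missing_mask = sample["Age"].isna()

# Only the rows whose Age is missing, keeping their original index labels
missing_rows = sample[missing_mask]
print(missing_rows)

# Just the index labels of the missing values
missing_positions = missing_rows.index.tolist()
print(missing_positions)
```

On the real data, the same idea would be df[df['Age'].isna()], assuming df is loaded as in the question.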
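You can try the following snippet. When called on a Series, dropna ignores the how argument, since there is only a single column. More importantly, dropping values from the df['Age'] Series cannot remove rows from df itself, so the rows have to be dropped at the DataFrame level: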
df.dropna(subset=["Age"], how="any", inplace=True)
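To check the behaviour without downloading the survey file, here is a small sketch on made-up data (the frame toy and its columns are hypothetical). It compares dropping NaNs on the column alone with dropping rows on the DataFrame, and also shows a boolean-mask alternative that should give the same rows:

```python
import numpy as np
import pandas as pd

# Hypothetical toy frame; only the Age column mirrors the question
toy = pd.DataFrame({"Age": [22.0, np.nan, 31.0],
                    "Country": ["PL", "US", None]})

# Dropping NaNs on the column alone returns a new, shorter Series;
# the DataFrame itself keeps all of its rows
ages_only = toy["Age"].dropna()
print(len(ages_only), len(toy))

# Option 1: drop the rows at the DataFrame level, looking only at Age
cleaned = toy.dropna(subset=["Age"])

# Option 2: keep the rows where Age is not NaN via a boolean mask
cleaned_mask = toy[toy["Age"].notna()]

remaining_nans = int(cleaned["Age"].isna().sum())
print(remaining_nans)
print(cleaned.index.tolist(), cleaned_mask.index.tolist())
```

Both options return a new DataFrame rather than modifying toy, so the result has to be assigned back (or inplace=True used, as in the snippet above) for the change to stick.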