I just wrote this function to calculated the age's person based in two columns in a Python DataFrame. Unfortunately, if a use the return
the function return the same value for all rows, but if I use the print
statement the function gives me the right values.
Here is the code:
def calc_age(dataset):
index = dataset.index
for element in index:
year_nasc = train['DT_NASCIMENTO_BENEFICIARIO'][element][6:]
year_insc = train['ANO_CONCESSAO_BOLSA'][element]
age = int(year_insc) - int(year_nasc)
print ('Age: ', age)
#return age
train['DT_NASCIMENTO_BENEFICIARIO'] = 03-02-1987
train['ANO_CONCESSAO_BOLSA'] = 2009
What am I doing wrong?!
If what you want is to subtract the year of DT_NASCIMENTO_BENEFICIARIO
from ANO_CONCESSAO_BOLSA
, and df
is your DataFrame:
# cast to datetime
df["DT_NASCIMENTO_BENEFICIARIO"] = pd.to_datetime(df["DT_NASCIMENTO_BENEFICIARIO"])
df["age"] = df["ANO_CONCESSAO_BOLSA"] - df["DT_NASCIMENTO_BENEFICIARIO"].dt.year
# print the result, or do something else with it:
print(df["age"])