I ran into a problem today.
This is an example dataset:
import pandas as pd

example = {
    "a": ['1/1/1954 14:14', '2/14/2001 2:00', '2/15/2002 12:00'],
    "b": [1936, 1996, 1960],
}
# load into a DataFrame:
example = pd.DataFrame(example)
print(example)
What I was trying to do is:
example['c'] = example['a'] - example['b']
However, I got this error:
unsupported operand type(s) for -: 'str' and 'int'
I tried converting the strings to integers, but that did not work.
Could you recommend a package or a method for this subtraction? I have heard of datetime, but I am not sure how to parse the dates in column "a" with it.
Thank you in advance!
Convert values to datetimes and extract years:
y = pd.to_datetime(example['a']).dt.year
example['c'] = y - example['b']
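If all values in column "a" follow the same month/day/year hour:minute layout (an assumption based on the three sample rows), you can also pass an explicit format so pandas does not have to guess it. A minimal sketch:

```python
import pandas as pd

example = pd.DataFrame({
    "a": ["1/1/1954 14:14", "2/14/2001 2:00", "2/15/2002 12:00"],
    "b": [1936, 1996, 1960],
})

# Assumption: every value is month/day/year hour:minute, as in the sample data.
parsed = pd.to_datetime(example["a"], format="%m/%d/%Y %H:%M")

# .dt.year gives an integer Series, so the subtraction is plain integer math.
example["c"] = parsed.dt.year - example["b"]
print(example)
```

With an explicit format, a value that does not match raises an error instead of being parsed in an unexpected way.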
Or extract the 4-digit year between the / and the space:
y = example['a'].str.extract(r'/(\d{4})\s+', expand=False).astype(int)
example['c'] = y - example['b']
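One caveat with the regex approach: a row that does not match the pattern yields NaN, and astype(int) then fails. Below is a rough sketch of one way to tolerate such rows using pd.to_numeric and the nullable Int64 dtype. The "unknown" row is made up for illustration and is not part of the original data.

```python
import pandas as pd

example = pd.DataFrame({
    # The last value is a hypothetical malformed entry, added for illustration.
    "a": ["1/1/1954 14:14", "2/14/2001 2:00", "unknown"],
    "b": [1936, 1996, 1960],
})

# Rows that do not match the pattern come back as NaN.
years = example["a"].str.extract(r"/(\d{4})\s+", expand=False)

# Convert to numbers, then to nullable integers so missing values stay missing.
y = pd.to_numeric(years, errors="coerce").astype("Int64")

example["c"] = y - example["b"]
print(example)
```

Rows without a recognizable year end up as <NA> in column "c" instead of raising an exception.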