I have the following table in which for every Name
I would like to know the total days of employment, and current status (Employed or Not Employed).
Date Name EmploymentType
01-1-18 A Hired
10-1-18 A Fired
11-1-18 A Hired
15-1-18 A Fired
25-2-18 A Hired
25-2-18 B Hired
05-2-18 C Hired
15-2-18 C Fired
I would like the following result:
Total Days Employed Name Current Status
15 A Employed
0 B Employed
10 C Not Employed
It would be great if I can know how to do this in Google sheets or python, both appreciated.
It's not the most elegant solution but you can play from this or understand a little the logic behind
import pandas as pd
df = pd.DataFrame({"date":["2018-01-01", "2018-01-10","2018-01-11",
"2018-01-15", "2018-02-25","2018-02-25",
"2018-02-05", "2018-02-15"],
"name":["a"]*5+["b"]+["c"]*2,
"status":['hired', "fired","hired", "fired",
"hired", "hired", "hired", "fired"]})
def fun(x):
x = x.sort_values("date")\
.reset_index(drop=True)
res =[None]*2
# this tell you the last status
res[0] = x["status"].iloc[-1]
# here we count days between any hired and fired
res[1] = x["date"].diff().dt.days.values[1::2].sum()
return(res)
df["date"] = df["date"].astype("M8[us]")
out = df.groupby("name").apply(lambda x: fun(x)).reset_index()
out[["status", "days"]] = out[0].apply(pd.Series)
del out[0]
out
name status days
0 a hired 13.0
1 b hired 0.0
2 c fired 10.0
I will consider to add the today date in case the employee is still working.