I'm trying to clean up the dataframe below so that there is only one value for each date. The data changes daily based on the current date. So if today were 01/01/2020, the data would look like this:
Date        Value  Type
01/01/2020  38.3   Forecast
01/01/2020  31.85  Actual
01/01/2020  6.45   Delta
02/01/2020  31.08  Actual
03/01/2020  29     Forecast
04/01/2020  23.4   Forecast
05/01/2020  24.5   Forecast
06/01/2020  19.4   Forecast
07/01/2020  21.1   Forecast
08/01/2020  22.3   Forecast
09/01/2020  25.6   Forecast
I need to clean this dataframe so that if a date has an 'Actual' value, that value is used, and if not, the 'Forecast' value is used.
I have been using the code below to pull out the forecast rows, but then I lose the accuracy of the 'Actual' value where it is available, and it also excludes D+1 because that date has no forecast value.
sel = ['Forecast']
df = df.loc[df['Type'].isin(sel)]
The end result would look something like this:
Date        Value  Type
01/01/2020  31.85  Actual
02/01/2020  31.08  Actual
03/01/2020  29     Forecast
04/01/2020  23.4   Forecast
05/01/2020  24.5   Forecast
06/01/2020  19.4   Forecast
07/01/2020  21.1   Forecast
08/01/2020  22.3   Forecast
09/01/2020  25.6   Forecast
Any help much appreciated!
Perhaps this is not the best way, as I'm not overly familiar with pandas groupby transform/aggregate, but this works for me:
a = df.groupby("Date").apply(lambda x: x[x["Type"]=="Actual"] if (x["Type"]=="Actual").any() else x[x["Type"]=="Forecast"]).reset_index(drop=True)
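The same "Actual if present, otherwise Forecast" rule can also be sketched without groupby, by ranking the types, sorting, and keeping the first row per date. This is only an illustration under a few assumptions: the dates live in a column named "Date" (as in the line above), the sample frame is a shortened copy of the question's data with every date taken to be in 2020, dates are day-first, and the helper names priority and prio are made up for the example.

```python
import pandas as pd

# Shortened sample modelled on the question. The "Date" column name and
# the 2020 years are assumptions made for this illustration.
df = pd.DataFrame(
    {
        "Date": ["01/01/2020", "01/01/2020", "01/01/2020",
                 "02/01/2020", "03/01/2020", "04/01/2020"],
        "Value": [38.3, 31.85, 6.45, 31.08, 29.0, 23.4],
        "Type": ["Forecast", "Actual", "Delta",
                 "Actual", "Forecast", "Forecast"],
    }
)

# Parse the day-first strings so that sorting is chronological.
df["Date"] = pd.to_datetime(df["Date"], format="%d/%m/%Y")

# Lower number = preferred. Types missing from the mapping (here "Delta")
# become NaN and are dropped.
priority = {"Actual": 0, "Forecast": 1}

result = (
    df.assign(prio=df["Type"].map(priority))
    .dropna(subset=["prio"])
    .sort_values(["Date", "prio"])
    .drop_duplicates(subset="Date", keep="first")  # best-ranked row per date
    .drop(columns="prio")
    .reset_index(drop=True)
)

print(result)
```

Because the preference order sits in one dictionary, adding another type later (or changing which one wins) only means editing priority.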