I have a data of bond market like this:
Id row Date BuyPrice SellPrice Time
1 1 2017-10-30 94520 0 9:00:00
1 2 2017-10-30 94538 0 9:00:00
1 3 2017-10-30 94609 0 9:00:00
1 4 2017-10-30 94615 0 9:00:00
1 5 2017-10-30 94617 0 9:00:00
1 1 2017-09-20 99100 99159 9:00:10
1 2 2017-09-20 99102 99058 9:00:11
1 3 2017-09-20 99103 99057 9:00:12
1 4 2017-09-20 99104 99056 9:00:10
1 5 2017-09-20 99105 99055 9:00:10
1 1 2017-09-20 98100 99190 9:01:10
1 2 2017-09-20 98099 99091 9:01:10
1 3 2017-09-20 98098 99092 9:01:10
1 4 2017-09-20 98097 99093 9:01:10
1 5 2017-09-20 98096 99094 9:01:10
2 1 2010-11-01 99890 100000 10:00:02
2 2 2010-11-01 99899 100000 10:00:02
2 3 2010-11-01 99901 99899 9:00:02
2 4 2010-11-01 99920 99850 10:00:02
2 5 2010-11-01 99933 99848 10:00:23
Step 1:
I want to calculate the spread(=SellPrice - BuyPrice) for row number one for each id for each day and exclude zeros if there is a zero in BuyPrice or SellPrice(Report nan for this kind of data), data in this step should be like this:
id row Date BuyPrice SellPrice Spread
1 1 2017-10-30 94520 0 NaN
1 1 2017-09-20 99100 99159 59
1 1 2017-09-20 98100 99190 190
2 1 2010-11-01 99890 100000 110
Step 2:
Now i want to calculate average of Spread for each day for each id and give index respect to date
At last data should be like this:
Id Date avg.spread(average of spread for each day) index
1 2017-10-30 NaN 1
1 2017-09-20 124.5(=(59+190)/2) 2
2 2010-11-01 110 1
I've tried my best to understand what you want, although you haven't explicitly mentioned it, I think you want to groupby
on Id
, row
, and date
.
g = df.assign(diff=df.SellPrice.sub(df.BuyPrice))\
.groupby(['Id', 'row', 'Date']).diff.mean()
v = g.groupby(level=[0, 1]).cumcount().add(1).values
df = g.reset_index().assign(index=v)
df
Id row Date diff index
0 1 1 2017-09-20 574.5 1
1 1 1 2017-10-30 NaN 2
2 1 2 2017-09-20 474.0 1
3 1 2 2017-10-30 NaN 2
4 1 3 2017-09-20 474.0 1
5 1 3 2017-10-30 NaN 2
6 1 4 2017-09-20 474.0 1
7 1 4 2017-10-30 NaN 2
8 1 5 2017-09-20 474.0 1
9 1 5 2017-10-30 NaN 2
10 2 1 2010-11-01 110.0 1
11 2 2 2010-11-01 101.0 1
12 2 3 2010-11-01 -2.0 1
13 2 4 2010-11-01 -70.0 1
14 2 5 2010-11-01 -85.0 1