So I have some Trading data, and ultimately I would like to find 1 week return 1 month, 3 month and so on. To do this I think it is easiest if I can get a column with days existed for each asset.
Data is simplified to this:
Id | Date | Price |
---|---|---|
101 | 2014-09-14 | 0.7 |
101 | 2014-09-15 | 0.8 |
101 | 2014-09-16 | 0.9 |
101 | 2014-09-17 | 0.8 |
103 | 2014-10-03 | 1,6 |
103 | 2014-10-04 | 1.7 |
103 | 2014-10-05 | 1.7 |
103 | 2014-10-07 | 1.9 |
107 | 2015-10-17 | 24 |
107 | 2015-10-18 | 21 |
107 | 2015-10-23 | 22 |
I would then like an new column named "Days listed"
Id | Date | Price | Days listed |
---|---|---|---|
101 | 2014-09-14 | 0.7 | 1 |
101 | 2014-09-15 | 0.8 | 2 |
101 | 2014-09-16 | 0.9 | 3 |
101 | 2014-09-17 | 0.8 | 4 |
103 | 2014-10-03 | 1,6 | 1 |
103 | 2014-10-04 | 1.7 | 2 |
103 | 2014-10-05 | 1.7 | 3 |
103 | 2014-10-07 | 1.9 | 5 |
107 | 2015-10-17 | 24 | 1 |
107 | 2015-10-18 | 21 | 2 |
107 | 2015-10-23 | 22 | 7 |
I think it needs to be grouped by id, but other than that I am out of ideas.
use pd.groupby to get a earliest date for each group, which is then subtracted by the dates to get the days-listed
df['Date']=pd.to_datetime(df['Date'])
df['DaysListed']=(df['Date'] - df.groupby('Id')['Date'].transform('min')).dt.days+1
df
Id Date Price DaysListed
0 101 2014-09-14 0.7 1
1 101 2014-09-15 0.8 2
2 101 2014-09-16 0.9 3
3 101 2014-09-17 0.8 4
4 103 2014-10-03 1.6 1
5 103 2014-10-04 1.7 2
6 103 2014-10-05 1.7 3
7 103 2014-10-07 1.9 5
8 107 2015-10-17 24 1
9 107 2015-10-18 21 2
10 107 2015-10-23 22 7