Search code examples
pandasshiftstock

difference between two values in a pandas dataframe that are variable lengths apart


I am trying to automatically calculate the profit/loss from my trades. currently i have my pandas daatframe set up to return a hold column that contains 1's while the purcahse is active and a -1 once i have sold. the price column records the price of the stock while the hold time and count columns keep track of how long the trade has been held in two different ways.

what i am struggling to do is to calculate how much money i have made/lost. i need it to calculate (as a percentage)the difference ebtween the purchased price (the first non zero value ) and the sold value (the last non zero value in a series). the challenge comes from the tardes being of variable length so df.shift doesnt work.

below is a sample dataset:

thank you and if anything is unclear please ask

         Date   Hold  Price  Hold_Time   count
148  20190801     0   0.00          0       0
149  20190802     0   0.00          0       0
150  20190805     0   0.00          0       0
151  20190806     1  21.50          1       1
152  20190807     1  22.48          1       2
153  20190808     1  22.78          1       3
154  20190809     1  24.17          1       4
155  20190812     1  23.72          1       5
156  20190813    -1  23.39          0       0
157  20190814     0   0.00          0       0
158  20190815     0   0.00          0       0
159  20190816     0   0.00          0       0
160  20190819     0   0.00          0       0
161  20190820     0   0.00          0       0
162  20190821     0   0.00          0       0
163  20190822     0   0.00          0       0
164  20190823     1  24.80          1       1
165  20190826     1  24.00          1       2
166  20190827    -1  24.65          0       0
167  20190828     0      0          0       0
168  20190829     0      0          0       0

Solution

  • Thank you for providing with an easy-to-work-with dataset. Considering it is named as 'data', I propose the following solution

    import pandas as pd
    import numpy as np
    
    data = pd.read_clipboard()
    
    df = data.copy() # copy data on another dataframe
    
    # keep only rows where you bought or sell:
    df['transaction_id'] = df.Hold_Time - df.Hold_Time.shift()
    df = df.query('transaction_id!=0').dropna()
    
    # calculate profit/loss for each time you sold
    df['profit'] = np.where(df.Hold == -1, df.Price - df.Price.shift(), 0)
    
    # calculate total profit (or anything else you want, I hope it will be easy at this point) 
    TOTAL_PROFIT = df.profit.sum()