python pandas loops for-loop back-testing

How to create a for loop in python that backtests a trading strategy against available data


I am trying to write an algorithm in Python that backtests a strategy based on available data. The flow should be:

  1. Start with x amount of dollars
  2. If condition_1 is satisfied, then buy x/unit_price units, where unit_price is the trading price of the stock.
  3. If condition_2 is satisfied, then sell for units*unit_price dollars, where units is the total number of units bought in step 2.

For now, I want it to alternate as buy-sell-buy-sell... in that order.

This is the basic idea:

Date    Price  Pct_Change
1/1/21  1600   1.5 %
2/1/21  1680   -1.7 %

Here is the actual data frame:

DataFrame

So far I have tried defining two functions buy and sell and incorporating them into a for loop.

dollars = 1000
units = 0

def buy(dollars,unit_price):
    global units
    units=dollars/unit_price
    dollars = 0
def sell(units,unit_price):
    global dollars
    dollars=units*unit_price
    units=0
for pct_change in df['ETH-Pct-Change']:
    if pct_change <=-1.6819:
        buy(dollars,df['ETH-Close'])
    elif pct_change >= 1.4379:
        sell(units,df['ETH-Close'])
    else:
        pass

I expect it to execute a buy first and set units to the number of units bought, then sell and set dollars to the proceeds from the units sold, then buy again with the new dollar amount, and so on.

At the end of this exercise, I'd like to be able to see a total dollar value or total number of units bought.

What I get instead is this:

Input:

dollars

Output:

Date
2021-11-30    0.0
2021-12-01    0.0
2021-12-02    0.0
2021-12-03    0.0
2021-12-04    0.0
             ... 
2022-11-25    0.0
2022-11-26    0.0
2022-11-27    0.0
2022-11-28    0.0
2022-11-29    0.0
Name: ETH-Close, Length: 365, dtype: float64

Input:

units

Output:


Date
2021-11-30    0.0
2021-12-01    0.0
2021-12-02    0.0
2021-12-03    0.0
2021-12-04    0.0
             ... 
2022-11-25    0.0
2022-11-26    0.0
2022-11-27    0.0
2022-11-28    0.0
2022-11-29    0.0
Name: ETH-Close, Length: 365, dtype: float64

What am I doing wrong? Hopefully someone can help. Sorry if some things are obvious. I'm new and attempting to understand how it all works.


Solution

  • Global variables are not needed. dollars and units are defined at the top of the script, so the functions can read them, and each function returns the new value, which is assigned in the loop. The problem in your code is that you pass a whole column, df['ETH-Close'], to the function, when you need the value from a single row. That is why dollars and units come out as a 365-row Series instead of a single number. Instead, loop over the rows and pass the price from the current row, together with that row's index, to the function.

    To access a specific value, use label-based indexing with loc: the row index goes on the left, the column name on the right.

    The BUY and SELL switches prevent consecutive purchases or sales: after a buy the code waits for a sell signal, and after a sell it waits for a buy signal. On each transaction, the row index, signal type, units, unit_price and dollars are printed.

    Please check carefully that this is what you need.

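    import pandas as pd

    # df is the DataFrame from the question, with the columns
    # 'ETH-Close' and 'ETH-Pct-Change'.
    dollars = 1000
    units = 0

    # Switches that enforce strict buy/sell alternation.
    BUY = 0   # 0: allowed to buy, 1: already bought, waiting for a sell signal
    SELL = 1  # 0: allowed to sell, 1: nothing to sell yet, waiting for a buy signal


    def buy(unit_price, index):
        print('index', index, 'buy', 'units', dollars / unit_price, 'unit_price', unit_price, 'dollars', dollars)
        return dollars / unit_price


    def sell(unit_price, index):
        print('index', index, 'sell', 'units', units, 'unit_price', unit_price, 'dollars', units * unit_price)
        return units * unit_price


    # Iterate over the index labels so that loc works for any index type
    # (for example the Date index shown in the question).
    for i in df.index:
        pct_change = df.loc[i, 'ETH-Pct-Change']
        if BUY == 0 and pct_change <= -1.6819:
            units = buy(df.loc[i, 'ETH-Close'], i)
            dollars = 0  # all cash has been converted to units
            SELL = 0
            BUY = 1
        elif SELL == 0 and pct_change >= 1.4379:
            dollars = sell(df.loc[i, 'ETH-Close'], i)
            units = 0  # all units have been converted to cash
            SELL = 1
            BUY = 0


    # One of the two is 0, depending on whether the last trade was a buy or a sell.
    print(dollars, units)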
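
The same alternating logic can also be sketched as a self-contained function that keeps its state in local variables instead of module-level ones. This is only an illustration under some assumptions: the name backtest, its parameters, and the five-row sample data are made up for the example and are not from the question. Only the two thresholds and the column names are taken from it.

```python
import pandas as pd


def backtest(prices, pct_changes, start_dollars=1000.0,
             buy_below=-1.6819, sell_above=1.4379):
    """Alternate buy/sell over the rows; return (dollars, units, trades)."""
    dollars, units = float(start_dollars), 0.0
    trades = []
    for date, price, pct in zip(prices.index, prices, pct_changes):
        if units == 0 and dollars > 0 and pct <= buy_below:
            # In cash and a buy signal: convert all dollars to units.
            units, dollars = dollars / price, 0.0
            trades.append((date, "buy", price))
        elif units > 0 and pct >= sell_above:
            # Holding units and a sell signal: convert all units to dollars.
            dollars, units = units * price, 0.0
            trades.append((date, "sell", price))
    return dollars, units, trades


# Made-up sample data, for illustration only (not the asker's data).
df = pd.DataFrame(
    {"ETH-Close": [100.0, 98.0, 102.0, 99.0, 101.0],
     "ETH-Pct-Change": [2.0, -2.0, 4.08, -2.94, 2.02]},
    index=pd.date_range("2021-11-30", periods=5, name="Date"),
)

dollars, units, trades = backtest(df["ETH-Close"], df["ETH-Pct-Change"])
print(dollars, units)
for trade in trades:
    print(trade)
```

Because the function checks whether it currently holds units, a sell signal that arrives before the first buy (as in the first sample row) is simply skipped, and no separate BUY/SELL flags are needed.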