python for-loop simulation agent-based-modeling

Attribute change with variable number of time steps

I would like to simulate individual changes in growth and mortality for a variable number of days. My dataframe is formatted as follows...

    import pandas as pd

    data = {'unique_id':  [2, 4, 5, 13],
            'length': [27.7, 30.2, 25.4, 29.1],
            'no_fish': [3195, 1894, 8, 2774],
            'days_left': [253, 253, 254, 256],
            'growth': [0.3898, 0.3414, 0.4080, 0.3839]
           }

    df = pd.DataFrame(data)

    print(df)

       unique_id  length  no_fish  days_left  growth
    0          2    27.7     3195        253  0.3898
    1          4    30.2     1894        253  0.3414
    2          5    25.4        8        254  0.4080
    3         13    29.1     2774        256  0.3839

Ideally, I would like the initial length (i.e., length) to increase by the daily growth rate (i.e., growth) for each of the days remaining in the year (i.e., days_left).

    df['final'] = df['length'] + (df['days_left'] * df['growth'])

However, I would also like to update the number of fish that each individual represents (i.e., no_fish) on a daily basis using a size-specific equation. I'm fairly new to Python, so I initially thought to use a for-loop (I'm not sure whether there is a more efficient way). My code is as follows:

import math
import time

# keep track of run time - START
start_time = time.perf_counter()

df['z'] = 0.0
for indx in range(len(df)): 
    count = 1
    while count <= int(df.days_left[indx]):
   
        # (1) update individual length
        df.length[indx] = df.length[indx] + df.growth[indx]
    
        # (2) estimate daily size-specific mortality 
        if df.length[indx] > 50.0:
            df.z[indx] = 0.01
        else:
            if df.length[indx] <= 50.0:
                df.z[indx] = 0.052857-((0.03/35)*df.length[indx])
            elif df.length[indx] < 15.0:
                df.z[indx] = 0.728*math.exp(-0.1892*df.length[indx])
    
        df['no_fish'].round(decimals = 0)
        if df.no_fish[indx] < 1.0:
            df.no_fish[indx] = 0.0
        elif df.no_fish[indx] >= 1.0:
            df.no_fish[indx] = df.no_fish[indx]*math.exp(-(df.z[indx]))
    
        # (3) reduce no. of days left in forecast by 1
        count = count + 1

# keep track of run time - END
total_elapsed_time = round(time.perf_counter() - start_time, 2)
print("Forecast iteration completed in {} seconds".format(total_elapsed_time))

The above code now works correctly, but it is still far too inefficient to run for 40,000 individuals, each for 200+ days.

I would really appreciate any advice on how to modify the code above to make it more efficient and Pythonic.

Thanks

Solution

As I said in my comment, vectorized operations are preferable to for loops in this setting. For reference, running your code:

import pandas as pd
import time
import math
import numpy as np

data = {'unique_id':  [2, 4, 5, 13],
        'length': [27.7, 30.2, 25.4, 29.1],
        'no_fish': [3195, 1894, 8, 2774],
        'days_left': [253, 253, 254, 256],
        'growth': [0.3898, 0.3414, 0.4080, 0.3839]
       }

df = pd.DataFrame(data)

print(df)

# keep track of run time - START
start_time = time.perf_counter()

df['z'] = 0.0
for indx in range(len(df)): 
    count = 1
    while count <= int(df.days_left[indx]):
   
        # (1) update individual length
        df.length[indx] = df.length[indx] + df.growth[indx]
    
        # (2) estimate daily size-specific mortality 
        if df.length[indx] > 50.0:
            df.z[indx] = 0.01
        else:
            if df.length[indx] <= 50.0:
                df.z[indx] = 0.052857-((0.03/35)*df.length[indx])
            elif df.length[indx] < 15.0:
                df.z[indx] = 0.728*math.exp(-0.1892*df.length[indx])
    
        df['no_fish'].round(decimals = 0)
        if df.no_fish[indx] < 1.0:
            df.no_fish[indx] = 0.0
        elif df.no_fish[indx] >= 1.0:
            df.no_fish[indx] = df.no_fish[indx]*math.exp(-(df.z[indx]))
    
        # (3) reduce no. of days left in forecast by 1
        count = count + 1

# keep track of run time - END
total_elapsed_time = round(time.perf_counter() - start_time, 2)
print("Forecast iteration completed in {} seconds".format(total_elapsed_time))
print(df)

with output:

   unique_id  length  no_fish  days_left  growth
0          2    27.7     3195        253  0.3898
1          4    30.2     1894        253  0.3414
2          5    25.4        8        254  0.4080
3         13    29.1     2774        256  0.3839
Forecast iteration completed in 31.75 seconds
   unique_id    length     no_fish  days_left  growth     z
0          2  126.3194  148.729190        253  0.3898  0.01
1          4  116.5742   93.018465        253  0.3414  0.01
2          5  129.0320    0.000000        254  0.4080  0.01
3         13  127.3784  132.864757        256  0.3839  0.01
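
The size-specific mortality rule in the loop above can also be written as one array expression, which is the building block for vectorizing the rest. Below is a minimal sketch; `daily_mortality` is a hypothetical helper name, not something from your code. It assumes the `< 15.0` case is meant to take priority, so it tests that condition first. In the loop, `<= 50.0` is checked first, so the `< 15.0` branch can never run (this makes no difference for the sample data, where no length is below 15).

```python
import numpy as np

def daily_mortality(length):
    """Daily mortality z for an array of lengths (hypothetical helper)."""
    length = np.asarray(length, dtype=float)
    # np.select picks the first condition that is True for each element
    return np.select(
        [length < 15.0, length <= 50.0],
        [0.728 * np.exp(-0.1892 * length),      # small fish: exponential rule
         0.052857 - (0.03 / 35) * length],      # mid-sized fish: linear rule
        default=0.01,                           # fish longer than 50
    )

z = daily_mortality([10.0, 27.7, 50.0, 60.0])
print(z)
```

Because the function takes and returns arrays, it can be applied to a whole column at once instead of one row at a time.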

Now with vector operations, you could do something like:

# rebuild the frame so the run starts from the original values, and store
# no_fish as float because it becomes fractional once mortality is applied
df = pd.DataFrame(data)
df['no_fish'] = df['no_fish'].astype(float)

# keep track of run time - START
start_time = time.perf_counter()
df['z'] = 0.0
for day in range(1, df.days_left.max() + 1):
    # rows that still have days left to simulate
    update = day <= df['days_left']

    # (1) update individual length
    # assign through .loc: df[update]['length'] = ... would write to a
    # temporary copy and leave df unchanged
    df.loc[update, 'length'] += df.loc[update, 'growth']

    # (2) estimate daily size-specific mortality
    length = df.loc[update, 'length']
    z = np.where(length > 50.0, 0.01, 0.052857 - (0.03 / 35) * length)
    z = np.where(length < 15.0, 0.728 * np.exp(-0.1892 * length), z)
    df.loc[update, 'z'] = z

    # (3) apply mortality; groups already below one fish are set to zero
    no_fish = df.loc[update, 'no_fish']
    df.loc[update, 'no_fish'] = np.where(no_fish < 1.0, 0.0, no_fish * np.exp(-z))
# keep track of run time - END
total_elapsed_time = round(time.perf_counter() - start_time, 2)
print("Forecast iteration completed in {} seconds".format(total_elapsed_time))
print(df)

with output

Forecast iteration completed in 1.32 seconds
   unique_id    length     no_fish  days_left  growth     z
0          2  126.3194  148.729190        253  0.3898  0.01
1          4  116.5742   93.018465        253  0.3414  0.01
2          5  129.0320    0.000000        254  0.4080  0.01
3         13  127.3784  132.864757        256  0.3839  0.01
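
If memory allows, the loop over days can be removed as well by laying the simulation out as an (individuals x days) array. The following is only a sketch under some assumptions: the column names `final_length` and `final_no_fish` are made up for illustration, length grows by the same amount every day, and the `< 15.0` mortality rule is given priority. It reproduces the "set to zero once below one fish" rule by checking the number of fish at the start of each simulated day.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'unique_id': [2, 4, 5, 13],
    'length': [27.7, 30.2, 25.4, 29.1],
    'no_fish': [3195, 1894, 8, 2774],
    'days_left': [253, 253, 254, 256],
    'growth': [0.3898, 0.3414, 0.4080, 0.3839],
})

# Column vectors of shape (n, 1) so they broadcast against a row of day numbers
length0 = df['length'].to_numpy(dtype=float)[:, None]
growth = df['growth'].to_numpy(dtype=float)[:, None]
days_left = df['days_left'].to_numpy()[:, None]
n0 = df['no_fish'].to_numpy(dtype=float)

days = np.arange(1, int(days_left.max()) + 1)[None, :]   # shape (1, max_days)
active = days <= days_left             # True while a row is still being simulated

# Length after each day's growth
lengths = length0 + growth * days

# Daily mortality, set to zero on days past a row's horizon
z = np.select(
    [lengths < 15.0, lengths <= 50.0],
    [0.728 * np.exp(-0.1892 * lengths), 0.052857 - (0.03 / 35) * lengths],
    default=0.01,
)
z = np.where(active, z, 0.0)

# Survivors at the end of each day, ignoring the "< 1 fish" rule for now
survivors = n0[:, None] * np.exp(-np.cumsum(z, axis=1))

# The loop tests no_fish at the start of a day, i.e. the previous day's value
start_of_day = np.concatenate([n0[:, None], survivors], axis=1)[:, :-1]
extinct = ((start_of_day < 1.0) & active).any(axis=1)

df['final_length'] = (length0 + growth * days_left).ravel()
df['final_no_fish'] = np.where(extinct, 0.0, n0 * np.exp(-z.sum(axis=1)))
print(df[['unique_id', 'final_length', 'final_no_fish']])
```

The trade-off is memory: each intermediate array holds one float per individual per day, so 40,000 individuals over about 250 days is roughly 80 MB per array. If that is too much, the individuals can be processed in chunks with the same code.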