
How to add rows to a dataframe when values are recursively dependent?


I have a data frame with columns a and b:

df = pd.DataFrame(data=[[3, 6], [5, 10], [9, 18], [17, 34]], columns=["a", "b"])

The structure of this data is as follows,

if a_t denotes the value of column a at row t and the same for b_t, then

b_t = 2 * a_t
a_t = b_{t-1} - 1

Note that each value of a is determined by the previous value of b, and each value of b is determined by the current value of a. Because of this recursive dependency, I can't simply use .shift().
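
Substituting b_{t-1} = 2 * a_{t-1} into the second relation collapses the pair into a recurrence on a alone, which matches the sample rows (5 = 2 * 3 - 1, 9 = 2 * 5 - 1):

```latex
a_t = b_{t-1} - 1 = 2\,a_{t-1} - 1, \qquad b_t = 2\,a_t
```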

Given the value of a at row 0, how can I generate this data frame for a specified number of rows n efficiently, preferably without loops?

I tried using loops, but they become inefficient as the calculations get more complicated and as n increases. Are there better ways to generate recursively related columns?


Solution

  • Here is my take on your interesting question, for instance with 3 as the value of a at row 0 and 10 as n:

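    import pandas as pd
    
    A = 3   # value of a at row 0
    N = 10  # number of rows to generate
    
    # Start with a one-row frame holding the first (a, b) pair.
    dfs = [pd.DataFrame(data=[[A, 2 * A]], columns=["a", "b"])]
    for _ in range(N - 1):
        # Shift columns left so the previous b lands in column a, subtract 1
        # to get the new a, then fill the now-empty b with 2 * a.
        dfs.append(
            (dfs[-1].shift(-1, axis=1) - 1).pipe(
                lambda df_: df_.fillna(df_["a"].values[0] * 2)
            )
        )
    
    # Stack the one-row frames into a single data frame.
    df = pd.concat(dfs, ignore_index=True)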
    

    Then:

    print(df)
    # Output
            a       b
    0     3.0     6.0
    1     5.0    10.0
    2     9.0    18.0
    3    17.0    34.0
    4    33.0    66.0
    5    65.0   130.0
    6   129.0   258.0
    7   257.0   514.0
    8   513.0  1026.0
    9  1025.0  2050.0
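
For this particular recurrence there may be a way to avoid the loop entirely. Each a depends only on the previous a (a_t = 2 * a_{t-1} - 1), so a_t - 1 doubles at every row, which gives a_t = (a_0 - 1) * 2**t + 1. The sketch below uses that closed form with NumPy. The function name `closed_form_frame` and its parameters `a0` and `n` are made up for illustration, and the approach only applies because the relation is linear; a more complicated recurrence would likely still need an explicit loop.

```python
import numpy as np
import pandas as pd


def closed_form_frame(a0: int, n: int) -> pd.DataFrame:
    """Build n rows of (a, b) from the closed form of a_t = 2 * a_{t-1} - 1.

    Hypothetical helper, not from the original answer. Assumes integer
    inputs; int64 overflows once the values exceed 2**63 - 1 (around
    60 rows when a0 = 3).
    """
    t = np.arange(n, dtype=np.int64)   # row indices 0, 1, ..., n - 1
    a = (a0 - 1) * 2**t + 1            # a_t = (a_0 - 1) * 2**t + 1
    return pd.DataFrame({"a": a, "b": 2 * a})  # b_t = 2 * a_t


df_closed = closed_form_frame(a0=3, n=10)
print(df_closed)
```

Unlike the loop above, this version keeps both columns as integers rather than floats, since no NaN is ever introduced.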