Search code examples
pythonpython-3.xpandasnumpyfinance

How can we find the IRR of cash flows in a Dataframe?


I can easily find the NPV if items in a dataframe using the code below. But how can I get the IRR of the same items?

import numpy_financial as npf
import pandas as pd

# Intitialise data of lists
data = [{'Month': '2020-01-01', 'Expense':1000, 'Revenue':5000, 'Building':'Stadium'}, 
       {'Month': '2020-02-01', 'Expense':3000, 'Revenue':4000, 'Building':'Stadium'},
       {'Month': '2020-03-01', 'Expense':7000, 'Revenue':5000, 'Building':'Stadium'}, 
       {'Month': '2020-04-01', 'Expense':3000, 'Revenue':4000, 'Building':'Stadium'},
       {'Month': '2020-01-01', 'Expense':5000, 'Revenue':6000, 'Building':'Casino'}, 
       {'Month': '2020-02-01', 'Expense':5000, 'Revenue':4000, 'Building':'Casino'},
       {'Month': '2020-03-01', 'Expense':5000, 'Revenue':9000, 'Building':'Casino'},
       {'Month': '2020-04-01', 'Expense':6000, 'Revenue':10000, 'Building':'Casino'}]
df = pd.DataFrame(data)
df

df.groupby("Building")["Revenue"].apply(lambda x: npf.npv(rate=0.1, values=x))

Result:

Building
Casino     24587.528174
Stadium    15773.854245

I tried to find the IRR, like this.

df.groupby("Building")["Revenue"].apply(lambda x: npf.irr(values=x))

It calculates only NAN.

Result:

Building
Casino    NaN
Stadium   NaN

Documentation: https://numpy.org/numpy-financial/latest/irr.html


Solution

  • While I am no expert on Financial Analysis, I believe this question requires more explanation and assessment than what has been presented. So, with all due respect to @KarelZ's response which in fact produces an answer for the stated data, I think from a financial analysis standpoint it is not of much value.
    As defined Internal Rate of Return (IRR) is a metric used in financial analysis to estimate the profitability of potential investments. IRR is a discount rate that makes the net present value (NPV) of all cash flows equal to zero in a discounted cash flow analysis. The inherent assumptions in this definition are (1) there exists an initial investment and (2) there is a cashflow stream resulting from the investment.

    As defined Net Present Value (NPV) is the present value of the cash flows at a specified rate of return of your project compared to your initial investment. In practical terms, it's a method of calculating your return on investment, or ROI, for a project or expenditure. While NPV doesn't necessarily imply an initial investment, it does imply that for the calculation to be useful, the true cashflow should be evaluated which implies taking into account the expenses as well as the revenue to be meaningful.

    In order to compute a valid IRR we need to incorporate the initial investment in the structures and compute the IRR based on differences between Expenses and Revenue. With this in mind, I have modified the original dataset by adding a row to each structure showing the initial investment as an Expense. See below:

    # Intitialise data of lists
    data = [{'Month': '2019-12-01', 'Expense':100000, 'Revenue':0, 'Building':'Stadium'},
           {'Month': '2020-01-01', 'Expense':1000, 'Revenue':5000, 'Building':'Stadium'}, 
           {'Month': '2020-02-01', 'Expense':3000, 'Revenue':4000, 'Building':'Stadium'},
           {'Month': '2020-03-01', 'Expense':7000, 'Revenue':5000, 'Building':'Stadium'}, 
           {'Month': '2020-04-01', 'Expense':3000, 'Revenue':4000, 'Building':'Stadium'},
           {'Month': '2019-12-01', 'Expense':150000, 'Revenue':0, 'Building':'Casino'}, 
           {'Month': '2020-01-01', 'Expense':5000, 'Revenue':6000, 'Building':'Casino'}, 
           {'Month': '2020-02-01', 'Expense':5000, 'Revenue':4000, 'Building':'Casino'},
           {'Month': '2020-03-01', 'Expense':5000, 'Revenue':9000, 'Building':'Casino'},
           {'Month': '2020-04-01', 'Expense':6000, 'Revenue':10000, 'Building':'Casino'}]
    df = pd.DataFrame(data)  
    

    This produces the dataframe shown below:

        Month   Expense Revenue Building
    0   2019-12-01  100000  0   Stadium
    1   2020-01-01  1000    5000    Stadium
    2   2020-02-01  3000    4000    Stadium
    3   2020-03-01  7000    5000    Stadium
    4   2020-04-01  3000    4000    Stadium
    5   2019-12-01  150000  0   Casino
    6   2020-01-01  5000    6000    Casino
    7   2020-02-01  5000    4000    Casino
    8   2020-03-01  5000    9000    Casino
    9   2020-04-01  6000    10000   Casino  
    

    To this dataframe I added a CashFlow Column consisting of the difference between expense and revenue as follows:

    def computeCashFlow(e, r):
        return r-e  
    
    df['CashFlow'] = df.apply(lambda row: computeCashFlow(row.Expense, row.Revenue), axis= 1)  
    

    Which results in the addition of the CashFlow shown below:

        Month   Expense Revenue Building    CashFlow
    0   2019-12-01  100000  0   Stadium -100000
    1   2020-01-01  1000    5000    Stadium 4000
    2   2020-02-01  3000    4000    Stadium 1000
    3   2020-03-01  7000    5000    Stadium -2000
    4   2020-04-01  3000    4000    Stadium 1000
    5   2019-12-01  150000  0   Casino  -150000
    6   2020-01-01  5000    6000    Casino  1000
    7   2020-02-01  5000    4000    Casino  -1000
    8   2020-03-01  5000    9000    Casino  4000
    9   2020-04-01  6000    10000   Casino  4000  
    

    Using the CashFlow Column you can then compute IRR and NPR as follows:

    df.groupby("Building")["CashFlow"].apply(lambda x: npf.npv(rate=0.1, values=x))  
    Building
    Casino    -144180.042347
    Stadium    -96356.806229
    Name: CashFlow, dtype: float64  
    
    df.groupby('Building')['CashFlow'].apply(lambda x: npf.irr(x))  
    Building
    Casino    -0.559380
    Stadium   -0.720914
    Name: CashFlow, dtype: float64  
    

    Giving realistic results for NPV and IRR taking into account the original investments