Search code examples
rdata-manipulationdata-mininggeneric-derivation

Year on year growth rates for multiple columns basis unique ID & years in rows


I have a data set where I have a unique proposal ID, application year & financial statement year. One proposal ID shall have one application year(t) & could have t-1 &(or) t-2 financial year statements. I have multiple columns for debt, equity, networth etc & want to have two columns for YOY growth -F1 & YOY growth-2.

dataset :

Proposal ID Application Year Financial statement year Net sales
P1          2019             2019                     100
P1          2019             2018                     120
P1          2019             2017                     130 

Now basis each proposal ID I need additional columns on growth rates between financial statement years against my application year

desired output :

Proposal ID Application Year Financial statement year Net sales YOY - netsales-g1
P1          2019             2019                     100             (100-120)/120...
P1          2019             2018                     120
P1          2019             2017                     130 

this same step I need to do for all columns I have.

What I want is a function -- for each proposal ID it estimates the YOY growth & take out the latest application date as the final row with columns as YOY growth for all numeric variables in dataset

Thank you in advance for the help! :)


Solution

  • I am not sure but is it what you need?

    library(dplyr)
    library(tidyverse)
    data %>% arrange(Financial_Statement_Year) %>%
      mutate(Growth_Difference = Net_Sales - lag(Net_Sales)) %>%
      mutate(Growth_Rate = (Growth_Difference / Net_Sales) * 100)
    
    Proposal_ID Application_Year Financial_Statement_Year Net_Sales Growth_Difference Growth_Rate
    P3 2019 2017 130 NA NA
    P2 2019 2018 120 -10 -8.333
    P1 2019 2019 100 -20 -20.000