Search code examples
rmathreportdata-analysisfinance

Calculating Financial Delays Using R


I am currently working on a project to calculate losses for certain contracts, and I could really use your help with some challenges I'm facing.

I have successfully calculated losses for each individual contract. The Start Date indicates when the contract should have begun, and the Creation Date is when the payment was received and when the contract officially began.

The issue I’m encountering is related to calculating the total financial delays per year. Some contracts have a Start Date in 2023 but a Creation Date in 2024. My goal is to allocate the total amount of the payments delayed to the appropriate year and month so that I can calculate the number accurately for both 2023 and 2024. If a contract separates between the two years, then the program should calculate how much from that contract was lost in 2023 and how much in 2024 in each month consequently.

I’ve filtered the data to only include contracts starting from January 1st, 2023. My table includes the following columns:

Delay_Days: Calculates the total number of days delayed.

Excess_Delay: Calculates the number of days beyond the first 31 days (since delays within 31 days are not counted for losses).

I’ve been struggling with this for the past three days and feel like I’m missing something. Could you please advise on the best R code to achieve this?

Additionally, any suggestions or advice for improving the report would be greatly appreciated. If there are any aspects you think could be improved, added, or analyzed from a data science or finance perspective, I would love to hear your thoughts.

Thank you so much for your time! I’ve also attached a snapshot of my table for reference (in reality there are 190 contracts that have delay for more than 31 days):

enter image description here


Solution

  • first off you might consider reading up on how to post a good question and how to produce a reprex. This makes it easier for others to help you quickly.

    On to your question. You could calculate the loss per year using a function that incorporates conditionals. Firstly, I created some data using what you provided in the attached image. The data is not identical to what you provided because in the table you provide there are no observations where starting_date and creation_date are in separate years. To show how the loss can be calculated per year, I therefore made an entries where creation_date and starting_date are either both in 2023; both in 2024; or one in 2023 and one in 2024.

    library(tidyverse)
    
    df <- data.frame(contract_so_id = 1:3,
                     start_date = as.Date(c("01/01/23", "01/05/23", "01/01/24"), format = "%d/%m/%y"),
                     creation_date = as.Date(c("17/04/23", "17/05/24", "17/05/24"), format = "%d/%m/%y"),
                     daily_loss = c(464, 211, 179))
    
    

    Next you can create a function that calculates the loss per year. To do this, the function contains conditions regarding the starting_date, creation_date and the year for which you want to calculate the loss. Because you indicate that your data is spread between 2023 and 2024, these are the only years I've used as conditions.

    loss_func <- function(yr, sd, cd, dl){
    
      end.2023 <- as.Date("31/12/23", format = "%d/%m/%y")
       
      if(yr == 2023){
        if(year(cd) == 2023 && year(sd) == 2023){
          loss <-  (as.numeric(difftime(cd, sd, units = "days"))-31)*dl
          } else if(year(cd) == 2024 && year(sd) == 2023){
            loss <-  (as.numeric(difftime(end.2023, sd, units = "days"))-31)*dl
            } else{
              loss <- 0
              }
        } 
      
      if(yr == 2024){
        if(year(cd) == 2024 && year(sd) == 2024){
          loss <- (as.numeric(difftime(cd, sd, units = "days"))-31)*dl
          } else if(year(cd) == 2024 && year(sd) == 2023){
            loss <- as.numeric(difftime(cd, end.2023, units = "days"))*dl
            } else{
              loss <- 0
            }
        }
      
      return(loss)
    }
    

    The function can then be implemented to calculate the loss for the years 2023 and 2024.

    df %>%
      mutate(delay_days = as.numeric(difftime(creation_date, start_date, units = "days")),
             excess_delay = as.numeric(difftime(creation_date, start_date, units = "days"))-31,
             total_loss = excess_delay*daily_loss) %>%
      rowwise() %>%
      mutate(loss_2023 = loss_func(2023, start_date, creation_date, daily_loss),
             loss_2024 = loss_func(2024, start_date, creation_date, daily_loss)) %>%
      ungroup() 
    

    Hope this helps and merry Christmas!