Search code examples
rtidyverse

In R , there are `actual` and `budget` values,how to add new variable and calculate the variable values


In variable type ,there are actual and budget values,how to add new variable and calculate the variable value ? Current code can work, but a little bording. Anyone can help? Thanks!

 ori_data <- data.frame(
    category=c("A","A","A","B","B","B"),
    year=c(2021,2022,2022,2021,2022,2022),
    type=c("actual","actual","budget","actual","actual","budget"),
    sales=c(100,120,130,70,80,90),
    profit=c(3.7,5.52,5.33,2.73,3.92,3.69)
    )

Add sales inc%

ori_data$sales_inc_or_budget_acheved[category=='A'&year=='2022'&type=='actual'] <- 
  ori_data$sales[category=='A'&year=='2022'&type=='actual']/
  ori_data$sales[category=='A'&year=='2021'&type=='actual']-1

Add budget acheved%

ori_data$sales_inc_or_budget_acheved[category=='A'&year=='2022'&type=='budget'] <- 
  ori_data$sales[category=='A'&year=='2022'&type=='actual']/
  ori_data$sales[category=='A'&year=='2022'&type=='budget']

enter image description here


Solution

  • Using a group_by and an if_elseyou could do:

    library(dplyr)
    
    ori_data |> 
      group_by(category) |> 
      arrange(category, type, year) |> 
      mutate(sales_inc_or_budget_achieved = if_else(type == "actual", 
                                                    sales / lag(sales) - 1, 
                                                    lag(sales) / sales)) |> 
      ungroup()
    #> # A tibble: 6 × 6
    #>   category  year type   sales profit sales_inc_or_budget_achieved
    #>   <chr>    <dbl> <chr>  <dbl>  <dbl>                        <dbl>
    #> 1 A         2021 actual   100   3.7                        NA    
    #> 2 A         2022 actual   120   5.52                        0.2  
    #> 3 A         2022 budget   130   5.33                        0.923
    #> 4 B         2021 actual    70   2.73                       NA    
    #> 5 B         2022 actual    80   3.92                        0.143
    #> 6 B         2022 budget    90   3.69                        0.889
    

    And using across you could do the same for both sales and profit:

    ori_data |> 
      group_by(category) |> 
      arrange(category, type, year) |> 
      mutate(across(c(sales, profit), ~ if_else(type == "actual", 
                                                .x / lag(.x) - 1, 
                                                lag(.x) / .x), 
                    .names = "{.col}_inc_or_budget_achieved")) |> 
      ungroup()
    #> # A tibble: 6 × 7
    #>   category  year type   sales profit sales_inc_or_budget_achie… profit_inc_or_b…
    #>   <chr>    <dbl> <chr>  <dbl>  <dbl>                      <dbl>            <dbl>
    #> 1 A         2021 actual   100   3.7                      NA               NA    
    #> 2 A         2022 actual   120   5.52                      0.2              0.492
    #> 3 A         2022 budget   130   5.33                      0.923            1.04 
    #> 4 B         2021 actual    70   2.73                     NA               NA    
    #> 5 B         2022 actual    80   3.92                      0.143            0.436
    #> 6 B         2022 budget    90   3.69                      0.889            1.06