r, function, error-handling, dplyr, integral

Longer object length is not a multiple of shorter object length while using mutate


I am trying to calculate the value of the function EQ_A0 for each row of df and add it as a new column, but I get a "longer object length is not a multiple of shorter object length" error. When I filter out a single row and run it for just that row, there is no error. The result of EQ_A0 for each row is a single numeric value. I don't understand why I get this error. I would appreciate your help. Here is my code:

EQ_A0 <- function(S_a, lambda_a, c){
  integrate(integrand2, 0, 30, S_a, lambda_a, c, subdivisions=2000, rel.tol=.Machine$double.eps^.05)$value
}

integrand2 <- function(tau, S_a, lambda_a, c){
  exp(log(tau)+h_A0(tau, S_a, lambda_a, c))
}
h_A0 <- function(tau, S_a, lambda_a, c){
  dgamma(tau, shape=S_a, scale = lambda_a*c, log = TRUE) - pgamma(30, shape=S_a, scale = lambda_a*c, lower.tail = TRUE, log.p=TRUE)
}

df <- data.frame(cc=c(0.06329820, 0.05141647), ya=c(31, 256), Sa=c(31,256), yb=c(2865, 742), Sb=c(2993, 1348))

library(dplyr)

df %>% 
  mutate(asd=EQ_A0(Sa, 350, cc))


The following worked, but I still don't understand why mutate does not work.

mapply(EQ_A0, df$Sa, 350, df$cc)
cbind(df,f = mapply(EQ_A0, df$Sa, 350, df$cc) )

Solution

  • The problem is that EQ_A0 is not vectorized over the parameters S_a and c. The warning (not an error)

    Warning message:
    In dgamma(tau, shape = S_a, scale = lambda_a * c, log = TRUE) -  :
      longer object length is not a multiple of shorter object length
    

    is raised inside h_A0 because S_a and c are vectors of length 2 instead of scalars (you can check this by adding a browser() statement).

    As I already mentioned in my comment, the result of dgamma(tau, shape=S_a, scale = lambda_a*c, log = TRUE) is a vector of length 21, because integrate() passes tau as a vector of 21 evaluation points, while the result of pgamma(30, shape=S_a, scale = lambda_a*c, lower.tail = TRUE, log.p=TRUE) is a vector of length 2. Hence, when subtracting the latter from the former, R raises the warning because 21 (the length of the longer object) is not a multiple of 2 (the length of the shorter object). R still performs the computation by recycling the shorter vector, but the result is incorrect.
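
    The length mismatch can be reproduced without integrate(). This is only a minimal sketch: tau is a made-up vector of 21 points standing in for whatever integrate() passes to the integrand, and the names shape, scale, lhs, rhs and res are purely illustrative. The Sa and cc values are the ones from the question.

    ```r
    # Hypothetical stand-in for the 21 evaluation points passed to the integrand
    tau   <- seq(1, 29, length.out = 21)
    shape <- c(31, 256)                        # df$Sa
    scale <- 350 * c(0.06329820, 0.05141647)   # lambda_a * df$cc

    lhs <- dgamma(tau, shape = shape, scale = scale, log = TRUE)
    rhs <- pgamma(30, shape = shape, scale = scale, lower.tail = TRUE, log.p = TRUE)

    length(lhs)  # 21
    length(rhs)  # 2

    # 21 is not a multiple of 2, so the subtraction recycles rhs and warns
    res <- lhs - rhs
    ```

    Note that dgamma() itself recycles shape and scale along tau without complaint, so odd-numbered elements of tau are silently paired with the first row's parameters and even-numbered ones with the second row's. The warning only appears at the subtraction.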

    Also, this is not related to mutate, which you can check by calling EQ_A0(df$Sa, 350, df$cc) directly (this is effectively what mutate does with your expression).

    To solve this issue, you have to loop over the rows of parameters in your data frame, either with map2_dbl (the equivalent of mapply(EQ_A0, df$Sa, 350, df$cc)) or with rowwise:

    library(dplyr)
    library(purrr)
    
    EQ_A0 <- function(S_a, lambda_a, c){
      integrate(integrand2, 0, 30, S_a, lambda_a, c, subdivisions=2000, rel.tol=.Machine$double.eps^.05)$value
    }
    
    integrand2 <- function(tau, S_a, lambda_a, c){
      exp(log(tau)+h_A0(tau, S_a, lambda_a, c))
    }
    
    h_A0 <- function(tau, S_a, lambda_a, c){
      #browser()
      dgamma(tau, shape=S_a, scale = lambda_a*c, log = TRUE) - pgamma(30, shape=S_a, scale = lambda_a*c, lower.tail = TRUE, log.p=TRUE)
    }
    
    df <- data.frame(cc=c(0.06329820, 0.05141647), ya=c(31, 256), Sa=c(31,256), yb=c(2865, 742), Sb=c(2993, 1348))
    
    # Solution 1: use map2_dbl to loop over parameters
    df %>% 
      mutate(asd = map2_dbl(Sa, cc, ~ EQ_A0(.x, 350, .y)))
    #>           cc  ya  Sa   yb   Sb      asd
    #> 1 0.06329820  31  31 2865 2993 29.02379
    #> 2 0.05141647 256 256  742 1348 29.88251
    
    # Solution 2: use rowwise to loop over parameters
    df %>% 
      rowwise() %>% 
      mutate(asd=EQ_A0(Sa, 350, cc)) %>% 
      ungroup()
    #> # A tibble: 2 x 6
    #>       cc    ya    Sa    yb    Sb   asd
    #>    <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
    #> 1 0.0633    31    31  2865  2993  29.0
    #> 2 0.0514   256   256   742  1348  29.9
    

    Created on 2020-04-04 by the reprex package (v0.3.0)
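
    A further option, not part of the reprex above, is to wrap EQ_A0 with base R's Vectorize() so that it can be called on whole columns. This is a sketch under the assumption that lambda_a stays a scalar; the wrapper name EQ_A0_vec is made up for illustration, and the data frame is trimmed to the two columns that are actually used.

    ```r
    # Same helper functions as in the question
    h_A0 <- function(tau, S_a, lambda_a, c) {
      dgamma(tau, shape = S_a, scale = lambda_a * c, log = TRUE) -
        pgamma(30, shape = S_a, scale = lambda_a * c, lower.tail = TRUE, log.p = TRUE)
    }
    integrand2 <- function(tau, S_a, lambda_a, c) {
      exp(log(tau) + h_A0(tau, S_a, lambda_a, c))
    }
    EQ_A0 <- function(S_a, lambda_a, c) {
      integrate(integrand2, 0, 30, S_a, lambda_a, c,
                subdivisions = 2000, rel.tol = .Machine$double.eps^.05)$value
    }

    # Hypothetical wrapper: calls EQ_A0 once per (S_a, c) pair via mapply
    EQ_A0_vec <- Vectorize(EQ_A0, vectorize.args = c("S_a", "c"))

    df <- data.frame(cc = c(0.06329820, 0.05141647), Sa = c(31, 256))
    df$asd <- EQ_A0_vec(df$Sa, 350, df$cc)
    df
    ```

    With dplyr loaded, the same wrapper should work inside mutate as mutate(asd = EQ_A0_vec(Sa, 350, cc)). Vectorize() is only a thin layer over mapply, so this is a matter of convenience and not faster than the two solutions above.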