
map_df does not seem to pass an argument down to the inner function


My scenario involves loading data from the IMF API using the imf.data package. I need to extract multiple indices simultaneously, and their vector lengths may vary.

library(tidyverse)
library(data.table)
library(imf.data)

# for instance, one possible index vector looks like this
index <- c(
  # FA - Direct Investment
  "BFDA_BP6_USD", "BFDAD_BP6_USD", "BFDAE_BP6_USD",
  # FA - Portfolio Investment
  "BFPA_BP6_USD", "BFPAD_BP6_USD", "BFPAE_BP6_USD"
)

I defined a function to extract these indices:

getFAfromIFS <- function(iso2 = "US") {
  dta_IFS <- imf.data::load_datasets("IFS")
  # iso2 <- "US"
  index <- c(
    # FA - Direct Investment
    "BFDA_BP6_USD", "BFDAD_BP6_USD", "BFDAE_BP6_USD",
    # FA - Portfolio Investment
    "BFPA_BP6_USD", "BFPAD_BP6_USD", "BFPAE_BP6_USD"
  )
  
  each_dta.byeachISO2 <- map_df(index, function(xxx) 
  {
    # xxx <- index[1]
    print(paste0("...loading data from iso2=", iso2, " & index=", xxx))
    each_account <- dta_IFS$get_series(freq = "Q", ref_area = iso2, indicator = xxx) 
    print("... check? ") # <--- I found the function stops here every time!!!
    
    each_account %>%
      setDT() %>%
      setnames(c("time", "value", "Note")) %>%
      select(-Note) %>%
      .[, `:=` (ISO2 = iso2, 
                index = xxx)]
  }) %>%
    dcast( time + ISO2 ~ index, value.var = "value")
}

data1 <- getFAfromIFS(iso2 = "US")

It gives me this error:

Error in `map()`:
ℹ In index: 1.
Caused by error:
! object 'iso2' not found
Run `rlang::last_trace()` to see where the error occurred.

I'm very confused. My guess is that the error comes from the line each_account <- dta_IFS$get_series(freq = "Q", ref_area = iso2, indicator = xxx). But I have already defined both iso2 (as an argument of getFAfromIFS) and xxx (as the argument of the function passed to map_df).

What have I missed? Please help me. Thank you!


Solution

  • This doesn't really have anything to do with purrr or map. In this case, the dta_IFS$get_series function appears to have a bug that surfaces when it is called from inside another function with local variables as arguments. That function is defined as

    function (freq, ref_area, indicator, start_period = NULL, end_period = NULL) 
    {
        x <- eval(as_list(match.call()))
        return(get0(x, "IFS"))
    }
    

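    The problem is that eval() should be eval.parent(). As written, the argument expressions appear to be evaluated inside get_series itself rather than in the caller's environment, so variables that are local to the calling function are not found. This means that a call with literal arguments works fine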

    foo <- function() {
      dta_IFS <- imf.data::load_datasets("IFS")
      dta_IFS$get_series(freq = "Q", indicator = "BFDA_BP6_USD", ref_area = "US")  
    }
    foo()
    

    but this throws an error

    foo <- function() {
      dta_IFS <- imf.data::load_datasets("IFS")
      myindicator <- "BFDA_BP6_USD"
      dta_IFS$get_series(freq = "Q", indicator = myindicator, ref_area = "US")  
    }
    foo()
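
The difference between eval() and eval.parent() can be reproduced without the package. The sketch below uses two hypothetical stand-ins, series_eval() and series_eval_parent(), which are not the imf.data source; they only assume that the real function re-evaluates the expressions recorded by match.call(). The helper names and local_indicator are likewise made up for illustration.

```r
# Hypothetical stand-ins, NOT the imf.data implementation.

# Evaluates the recorded argument expression in its own frame:
# eval() defaults to the frame that called eval(), i.e. series_eval() itself.
series_eval <- function(indicator) {
  cl <- match.call()
  eval(cl$indicator)
}

# Evaluates the recorded argument expression in the caller's frame instead.
series_eval_parent <- function(indicator) {
  cl <- match.call()
  eval.parent(cl$indicator)
}

# Caller that passes a literal: nothing needs to be looked up.
call_with_literal <- function(f) {
  f(indicator = "BFDA_BP6_USD")
}

# Caller that passes a local variable: the symbol must be looked up somewhere.
call_with_variable <- function(f) {
  local_indicator <- "BFDA_BP6_USD"
  f(indicator = local_indicator)
}

literal_ok   <- call_with_literal(series_eval)
variable_err <- tryCatch(
  call_with_variable(series_eval),
  error = function(e) conditionMessage(e)
)
variable_ok  <- call_with_variable(series_eval_parent)

print(literal_ok)
print(variable_err)
print(variable_ok)
```

The literal call succeeds with either stand-in, while the variable call only succeeds when the lookup happens in the caller's frame, which matches the "object 'iso2' not found" symptom in the question.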
    

    Unfortunately, you cannot easily fix the bug itself (the package author would have to do that), but you can create a workaround. Change the line

    each_account <- dta_IFS$get_series(freq = "Q", ref_area = iso2, indicator = xxx) 
    

    to

    each_account <- do.call(dta_IFS$get_series, list(freq = "Q", indicator = xxx, ref_area = iso2))
    

    That should get around the error by forcing evaluation of the function parameters prior to execution.
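
As a rough illustration of why do.call() helps, here is a minimal sketch with a hypothetical stand-in, get_series_like(), that looks up its argument expression in its own frame. It is an assumption about the kind of behavior involved, not the package's code, and the wrapper names are made up. Because list() is evaluated in the calling function, the call that do.call() builds contains the value itself rather than a variable name, so there is nothing left to look up.

```r
# Hypothetical stand-in, NOT the imf.data implementation: it resolves the
# recorded argument expression in its own frame, so caller-local variables
# are invisible to it.
get_series_like <- function(indicator) {
  cl <- match.call()
  eval(cl$indicator, envir = environment())
}

# Direct call: match.call() records the symbol `local_indicator`,
# which cannot be resolved inside get_series_like().
fetch_direct <- function() {
  local_indicator <- "BFDA_BP6_USD"
  tryCatch(
    get_series_like(indicator = local_indicator),
    error = function(e) conditionMessage(e)
  )
}

# do.call(): list() evaluates local_indicator here, in the caller, so the
# constructed call carries the string value instead of the symbol.
fetch_do_call <- function() {
  local_indicator <- "BFDA_BP6_USD"
  do.call(get_series_like, list(indicator = local_indicator))
}

direct_result  <- fetch_direct()
do_call_result <- fetch_do_call()

print(direct_result)
print(do_call_result)
```

The same reasoning should apply inside the function passed to map_df: iso2 and xxx are turned into plain values before get_series ever sees them.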