Tags: r, import, package, roxygen2

R package: how does "import" work when my exported function does not explicitly call a function from another package, but a subroutine does?


I am developing my first R package and there is something that is not clear to me about Imports in the DESCRIPTION file. I went through quite a few guides that explain package structure, but I could not find an answer to my question, so here is my situation.

  • I define a function f that I will export, so its definition will have the proper @export roxygen comment on top.
  • Now, my function f calls a subroutine hidden that I do not want to export. Function hidden uses other packages too, say package X.

Because the call to X is inside function hidden, there is no tag @import X in my function f. Thus, I added package X to the Imports in my DESCRIPTION file, hoping to specify the relevant dependency there.

When I use devtools::document(), however, the generated NAMESPACE does not contain an entry for X. I can see why that happens: the parser just does not find the flag in the roxygen comment for f, and at runtime a call to f crashes because X is missing.

Now, I can probably fix everything by specifying X in the import of f. But why is the mechanism this tricky? Or, similarly, why do my imports in DESCRIPTION not match the ones in NAMESPACE?


Solution

  • My understanding is that there are three "correct" ways to do the import. By "correct," I mean that they will pass CRAN checks and function properly. Which option you choose is a matter of balancing various advantages and is largely subjective.

    I'll review these options below using the terminology

    • primary_function: the function in your package that you wish to export
    • hidden: the unexported function in your package used by primary_function
    • thirdpartypkg::blackbox: blackbox is an exported function from the thirdpartypkg package

    Option 1 (no direct import / explicit function call)

    I think this is the most common approach. thirdpartypkg is declared in the DESCRIPTION file, but nothing is imported from thirdpartypkg in the NAMESPACE file. In this option, it is necessary to use the thirdpartypkg::blackbox construct to get the desired behavior.

    # DESCRIPTION
    
    Imports: thirdpartypkg
    
    # NAMESPACE
    export(primary_function)
    
    
    #' @name primary_function
    #' @export
    
    primary_function <- function(x, y, z){
      # do something here
      hidden(a = y, b = x, c = z)
    }
    
    # Unexported function
    #' @name hidden
    
    hidden <- function(a, b, c){
      # do something here
    
      thirdpartypkg::blackbox(a, c)
    }
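
As a rough, self-contained sketch of Option 1 that can be run outside a package, the snippet below uses tools::file_path_sans_ext as a stand-in for the hypothetical thirdpartypkg::blackbox (the choice of function is my assumption, picked only because the tools package ships with R). It illustrates that the :: call resolves even when the package is not attached, which is the situation a package listed only under Imports in DESCRIPTION is in.

```r
# Stand-in for thirdpartypkg::blackbox (assumption: any exported function
# from a package that is installed but not attached would do).
hidden <- function(a, b, c) {
  # The package is named explicitly, so nothing needs to be in NAMESPACE.
  tools::file_path_sans_ext(a, compression = c)
}

primary_function <- function(x, y, z) {
  hidden(a = y, b = x, c = z)
}

result <- primary_function(x = NULL, y = "data.csv.gz", z = TRUE)
print(result)

# `::` loads the namespace on demand ...
print(isNamespaceLoaded("tools"))
# ... but does not attach the package to the search path
# (usually FALSE in a fresh session, unless something attached it earlier).
print("package:tools" %in% search())
```

In a real package the only extra requirement is the Imports entry in DESCRIPTION, so that the dependency is installed alongside your package.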
    

    Option 2 (direct import / no explicit function call)

    In this option, you directly import the blackbox function. Having done so, it is no longer necessary to use thirdpartypkg::blackbox; you may simply call blackbox as if it were a part of your package. (Technically it is: the import makes blackbox visible from within your package's namespace, so there's no need to reach into another namespace to get it.)

    # DESCRIPTION
    
    Imports: thirdpartypkg
    
    # NAMESPACE
    export(primary_function)
    importFrom(thirdpartypkg, blackbox)
    
    
    #' @name primary_function
    #' @export
    
    primary_function <- function(x, y, z){
      # do something here
      hidden(a = y, b = x, c = z)
    }
    
    # Unexported function
    #' @name hidden
    #' @importFrom thirdpartypkg blackbox
    
    hidden <- function(a, b, c){
      # do something here
    
      # I CAN USE blackbox HERE AS IF IT WERE PART OF MY PACKAGE
      blackbox(a, c)
    }
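
To make the lookup behind Option 2 more concrete, here is a toy model built from plain environments. It is only a sketch of the idea, not how R constructs namespaces internally in detail: imports_env and ns_env are hypothetical names, and stats::weighted.mean plays the role of blackbox. The point is that an unqualified call inside hidden is resolved by walking from the namespace to its parent, where imported bindings live.

```r
# Toy model only: imports_env stands in for the environment that R fills
# from importFrom() directives; ns_env stands in for the package namespace.
imports_env <- new.env(parent = baseenv())
ns_env <- new.env(parent = imports_env)

# Roughly what importFrom(thirdpartypkg, blackbox) achieves; here the
# imported function is stats::weighted.mean under the name `blackbox`.
assign("blackbox", stats::weighted.mean, envir = imports_env)

# Define `hidden` inside the toy namespace, so its free variables are
# looked up in ns_env first and then in imports_env.
evalq(
  hidden <- function(a, b, c) {
    blackbox(a, c)
  },
  ns_env
)

result <- ns_env$hidden(a = c(1, 3), b = NULL, c = c(1, 3))
print(result)

# blackbox is not defined in the namespace itself ...
print(exists("blackbox", envir = ns_env, inherits = FALSE))
# ... it is found one level up, among the imports ...
print(exists("blackbox", envir = imports_env, inherits = FALSE))
# ... and it is the same function object that `::` would return.
print(identical(get("blackbox", envir = imports_env), stats::weighted.mean))
```

Because the imported binding and the :: lookup end up at the same function, choosing between them changes how the name is found, not what gets called.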
    

    Option 3 (direct import / explicit function call)

    Your last option combines the previous two options: it imports blackbox into your namespace, but then uses the thirdpartypkg::blackbox construct to call it. This is "correct" in the sense that it works, but it can be argued to be wasteful and redundant.

    The reason I say it is wasteful and redundant is that, having imported blackbox to your namespace, you're never using it. Instead, you're using the blackbox in the thirdpartypkg namespace. Essentially, blackbox is now reachable in two ways, but only one of them is ever used, which raises the question of why import it at all.

    # DESCRIPTION
    
    Imports: thirdpartypkg
    
    # NAMESPACE
    export(primary_function)
    importFrom(thirdpartypkg, blackbox)
    
    
    #' @name primary_function
    #' @export
    
    primary_function <- function(x, y, z){
      # do something here
      hidden(a = y, b = x, c = z)
    }
    
    # Unexported function
    #' @name hidden
    #' @importFrom thirdpartypkg blackbox
    
    hidden <- function(a, b, c){
      # do something here
    
      # I CAN USE blackbox HERE AS IF IT WERE PART OF MY PACKAGE
      # EVEN THOUGH I DIDN'T. CONSEQUENTLY, THE blackbox I IMPORTED
      # ISN'T BEING USED.
      thirdpartypkg::blackbox(a, c)
    }
    

    Considerations

    So which is the best approach to use? There isn't really an easy answer to that. I will say that Option 3 is probably not the approach to take. I can tell you that Wickham advises against Option 3 (I had been developing under that framework and he advised me against it).

    If we make the choice between Option 1 and Option 2, the considerations we have to make are 1) efficiency of writing code, 2) efficiency of reading code, and 3) efficiency of executing code.

    When it comes to the efficiency of writing code, it's generally easier to @importFrom thirdpartypkg blackbox and avoid having to use the :: operator. It just saves a few keystrokes. This adversely affects readability of code, however, because now it isn't immediately apparent where blackbox comes from.

    When it comes to efficiency of reading code, it's superior to omit @importFrom and use thirdpartypkg::blackbox. This makes it obvious where blackbox comes from.

    When it comes to efficiency of executing code, it's better to @importFrom. Calling thirdpartypkg::blackbox is about 0.1 milliseconds slower than using @importFrom and calling blackbox, though the exact overhead will vary with your machine and R version. That isn't a lot of time, so it probably isn't much of a consideration. But if your package uses hundreds of :: constructs and then gets thrown into looping or resampling processes, those milliseconds can start to add up.
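
If you want to check the overhead on your own machine rather than rely on a quoted figure, a crude base-R timing like the one below gives an order-of-magnitude estimate. This is a sketch under assumptions: stats::setNames is used as a cheap stand-in for blackbox, the unqualified call is resolved through the search path rather than through a package's imports, and system.time is a blunt instrument, so a dedicated benchmarking package would give tighter numbers.

```r
n <- 1e5  # number of repeated calls (arbitrary choice for illustration)
x <- 1

# Qualified call: `::` is evaluated on every iteration.
t_qualified <- system.time(
  for (i in seq_len(n)) stats::setNames(x, "a")
)[["elapsed"]]

# Unqualified call: the name is found by ordinary lookup.
t_unqualified <- system.time(
  for (i in seq_len(n)) setNames(x, "a")
)[["elapsed"]]

# Estimated extra cost per call, in microseconds. Timing noise can make
# this small or even negative on a single run, so repeat it a few times.
overhead_us <- (t_qualified - t_unqualified) / n * 1e6
cat(sprintf("qualified: %.3fs, unqualified: %.3fs, ~%.2f microseconds per call\n",
            t_qualified, t_unqualified, overhead_us))
```

Whatever number you get, multiply it by the number of calls you expect inside your tightest loop to judge whether it matters for your package.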

    Ultimately, I think the best guidance I've read (and I don't know where) is that if you are going to call blackbox more than a handful of times, it's worth using @importFrom. If you will only call it three or four times in a package, go ahead and use the :: construct.