
What's the best way to automatically generate roxygen2 documentation for a data frame?


My new CRAN package has 10 data frames in its data/ folder, each with 10 or so columns of various types (strings, integers, floats, booleans, etc.).

I need to add roxygen2 documentation for each of these data sources. Is there a method that autogenerates comment blocks given a data.frame?

Something like: makeDocs(games)

#' games
#'  title character
#'  score integer
#'  value numeric
#'  ...

With ~100 columns in total, I worry that doing it by hand would introduce mistakes, and that I would have to keep re-editing the docs whenever a column name changes.

I found this great answer about documenting datasets: "How can I document data sets with roxygen?"

... but it does not address how I can autogenerate these comments.


Solution

  • Start with a list of the frames' names, then something like this is a quick hack:

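    # Names of the data frames to document
    frames <- c("iris","mtcars")
    # For each frame: a title line, the @format tag, the str() output as roxygen
    # comments, and the quoted object name that roxygen2 attaches the block to
    unlist(sapply(frames, function(d) c(paste("#'", d), "#' @format data.frame",
                                        gsub("^","#'",capture.output(str(get(d)))),
                                        dQuote(d, FALSE)),  # FALSE: plain ASCII quotes (R >= 3.6.0)
                  simplify=FALSE), use.names=FALSE)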
    #  [1] "#' iris"                                                                                    
    #  [2] "#' @format data.frame"                                                                      
    #  [3] "#''data.frame':\t150 obs. of  5 variables:"                                                  
    #  [4] "#' $ Sepal.Length: num  5.1 4.9 4.7 4.6 5 5.4 4.6 5 4.4 4.9 ..."                            
    #  [5] "#' $ Sepal.Width : num  3.5 3 3.2 3.1 3.6 3.9 3.4 3.4 2.9 3.1 ..."                          
    #  [6] "#' $ Petal.Length: num  1.4 1.4 1.3 1.5 1.4 1.7 1.4 1.5 1.4 1.5 ..."                        
    #  [7] "#' $ Petal.Width : num  0.2 0.2 0.2 0.2 0.2 0.4 0.3 0.2 0.2 0.1 ..."                        
    #  [8] "#' $ Species     : Factor w/ 3 levels \"setosa\",\"versicolor\",..: 1 1 1 1 1 1 1 1 1 1 ..."
    #  [9] "\"iris\""                                                                                   
    # [10] "#' mtcars"                                                                                  
    # [11] "#' @format data.frame"                                                                      
    # [12] "#''data.frame':\t32 obs. of  11 variables:"                                                  
    # [13] "#' $ mpg : num  21 21 22.8 21.4 18.7 18.1 14.3 24.4 22.8 19.2 ..."                          
    # [14] "#' $ cyl : num  6 6 4 6 8 6 8 4 4 6 ..."                                                    
    # [15] "#' $ disp: num  160 160 108 258 360 ..."                                                    
    # [16] "#' $ hp  : num  110 110 93 110 175 105 245 62 95 123 ..."                                   
    # [17] "#' $ drat: num  3.9 3.9 3.85 3.08 3.15 2.76 3.21 3.69 3.92 3.92 ..."                        
    # [18] "#' $ wt  : num  2.62 2.88 2.32 3.21 3.44 ..."                                               
    # [19] "#' $ qsec: num  16.5 17 18.6 19.4 17 ..."                                                   
    # [20] "#' $ vs  : num  0 0 1 1 0 1 0 1 1 1 ..."                                                    
    # [21] "#' $ am  : num  1 1 1 0 0 0 0 0 0 0 ..."                                                    
    # [22] "#' $ gear: num  4 4 4 3 3 3 3 4 4 4 ..."                                                    
    # [23] "#' $ carb: num  4 4 1 1 2 1 4 2 2 4 ..."                                                    
    # [24] "\"mtcars\""                                                                                 
    

    Then you can cat() the result out to a file (with sep = "\n", so each element lands on its own line) and have most of what you need.
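
If you would rather get one entry per column with just its class, closer to the makeDocs(games) idea in the question, something along these lines might work. This is only a sketch: make_docs is a hypothetical helper (it is not part of roxygen2), the block layout with \describe{} and \item{}{} is one common convention for dataset docs, and the output path is a scratch file chosen for illustration. Column names containing Rd special characters (such as %) would still need escaping, and real descriptions have to be filled in by hand.

```r
# Hypothetical helper (not a roxygen2 function): build a roxygen2 comment
# block for one data frame, listing each column with its class.
make_docs <- function(name) {
  df <- get(name)
  stopifnot(is.data.frame(df))
  # First class of each column, e.g. "numeric", "factor", "character"
  types <- vapply(df, function(col) class(col)[1], character(1))
  c(
    paste("#'", name),
    "#'",
    sprintf("#' @format A data frame with %d rows and %d variables:",
            nrow(df), ncol(df)),
    "#' \\describe{",
    sprintf("#'   \\item{%s}{%s}", names(df), types),
    "#' }",
    sprintf('"%s"', name),  # quoted object name that the block documents
    ""                      # blank line between blocks
  )
}

frames <- c("iris", "mtcars")
docs <- unlist(lapply(frames, make_docs), use.names = FALSE)

# Write to a scratch file; in a package this would typically be R/data.R
out <- file.path(tempdir(), "data.R")
writeLines(docs, out)
```

Because the blocks are regenerated from the data frames themselves, rerunning the script after a column is renamed refreshes the names and types, though any hand-written descriptions would need to be merged back in.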