
Print or Retain output from each run of a loop in R markdown notebook


I am trying to make a report on the data contents of each dataframe in a list of dataframes, by looping over the list and running head() and tail() on each one. However, the notebook output in RStudio only ever shows the result for the last dataframe.

Example:

# Initialize Variables
list_of_df <- list()
dummy_v1 <- c(1:3)
dummy_v2 <- c(4:6)
dummy_v3 <- 1

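for (i in 1:3) {
  # Note: i:i+3 parses as (i:i) + 3, so dummy_v3 is the single value i + 3,
  # which data.frame() recycles across all three rows
  dummy_v3 <- c(i:i+3)
  dummy_df <- data.frame(dummy_v1, dummy_v2, dummy_v3)
  list_of_df[[i]] <- dummy_df
}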


# Illustrate my Problem

# I would like to print the following output into the notebook report in a for loop for each dataframe in list_of_df:
head(list_of_df[[1]])
tail(list_of_df[[1]])

for (i in 1:length(list_of_df)) {
  head(list_of_df[[i]])
  tail(list_of_df[[i]])
}

# output only shows the last iteration's head and tails.

In reality, I have 130 dataframes, each with around 700 timeseries observations of 116 or 117 variables, so I need to do this programmatically. I want to get this output so that I can do a quick sanity check on the dataframes and then proceed with the timeseries analysis. TIA for your help!


Answer!!!

lapply is a funny function for us R newbies: you never write the function call yourself, because lapply passes each element of the list to the applied function as its first argument. The solution that worked for me was syntactically elegant:

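Report <- function(x) {
  # names(x) gives the column names of x, not the name of the data frame,
  # so collapse them into a single string
  list(sprintf("Variables: %s", paste(names(x), collapse = ", ")),
    sprintf("Columns: %d", ncol(x)),
    sprintf("Rows: %d", nrow(x)),
    head(x),
    tail(x))
}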

lapply(list_of_df, Report)

The output is not so elegant but it is workable and allows me to quickly peruse the results.
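
A more direct variant, sketched here on the assumption that the goal is simply to get every head() and tail() into the notebook: values are not auto-printed inside a for loop, so each call has to be wrapped in print(). The helper name print_report and its argument n are hypothetical, and a small list_of_df is rebuilt so the block runs on its own.

```r
# Rebuild a small example list so the sketch is self-contained
list_of_df <- lapply(1:3, function(i) {
  data.frame(dummy_v1 = 1:3, dummy_v2 = 4:6, dummy_v3 = i + 3)
})

# Hypothetical helper: print a labelled head/tail summary for each data frame
print_report <- function(dfs, n = 6) {
  for (i in seq_along(dfs)) {
    cat(sprintf("Data frame %d: %d rows, %d columns\n",
                i, nrow(dfs[[i]]), ncol(dfs[[i]])))
    print(head(dfs[[i]], n))  # explicit print() is needed inside a loop
    print(tail(dfs[[i]], n))
  }
  invisible(dfs)  # return the input invisibly so nothing extra is printed
}

print_report(list_of_df)
```

With 130 dataframes this produces a long but readable block of output, with one header line per dataframe to make scanning easier.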


Solution

  • If we want the output for each dataframe kept separate, return a list from lapply(). This is base R and needs no external packages

    lapply(list_of_df, function(x) list(head(x), tail(x)))
    

    With a for loop, the results need to be stored in an object, since the loop itself returns nothing useful

    out <- vector('list', length(list_of_df))
    for(i in seq_along(list_of_df)) {
        out[[i]][["head"]] <- head(list_of_df[[i]])
        out[[i]][["tail"]] <-  tail(list_of_df[[i]])
    }
    

    Or assign both elements in a single statement (out must already exist, as created above)

    for(i in seq_along(list_of_df)) {
        out[[i]][c("head", "tail")] <- list(head(list_of_df[[i]]), tail(list_of_df[[i]]))
    }
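
If the list carries names, the stored results can be looked up by name afterwards. The following is a minimal sketch under that assumption; the labels df_1, df_2, df_3 are hypothetical, and the example list is rebuilt so the block is self-contained.

```r
# Rebuild a small example list so the sketch is self-contained
list_of_df <- lapply(1:3, function(i) {
  data.frame(dummy_v1 = 1:3, dummy_v2 = 4:6, dummy_v3 = i + 3)
})

# Hypothetical labels; real data would likely carry more meaningful names
names(list_of_df) <- paste0("df_", seq_along(list_of_df))

# lapply() keeps the names of its input, so each summary stays labelled
out <- lapply(list_of_df, function(x) list(head = head(x), tail = tail(x)))

out$df_2$head  # inspect the head of a single dataframe
print(out)     # or print every stored summary at once
```

Storing the summaries rather than only printing them means they can be revisited later without re-running the loop.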