I am trying to make a report on the data contents of each dataframe in a list of dataframes, by looping over the list and running head() and tail() on each one. However, the output that RStudio writes to the notebook is always just that of the last dataframe.
Example:
# Initialize Variables
list_of_df <- list()
dummy_v1 <- c(1:3)
dummy_v2 <- c(4:6)
dummy_v3 <- 1
for (i in 1:3) {
dummy_v3 <- c(i:i+3)  # note: i:i+3 parses as (i:i) + 3, i.e. the single value i + 3, recycled over the rows
dummy_v3
dummy_df <- data.frame(dummy_v1, dummy_v2, dummy_v3)
list_of_df[[i]] <- dummy_df
}
# Illustrate my Problem
# I would like to print the following output into the notebook report in a for loop for each dataframe in list_of_df:
head(list_of_df[[1]])
tail(list_of_df[[1]])
for (i in 1:length(list_of_df)) {
head(list_of_df[[i]])
tail(list_of_df[[i]])
}
# output only shows the last iteration's head and tails.
In reality, I have 130 dataframes, each with around 700 timeseries observations of 116 or 117 variables, so I need to do this programmatically. I want to get this output so that I can do a quick sanity check on the dataframes and then proceed with the timeseries analysis. TIA for your help!
lapply is a funny function for us R newbies; you pass it the function itself, and it supplies each list element as that function's first argument without the call ever being written out. The solution that worked for me was syntactically elegant:
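# Summarise one data frame: variable names, dimensions, first and last rows
Report <- function(x) {
  # names(x) returns the column names of x, not the name of the list element,
  # so collapse them into a single string
  list(sprintf("Variables: %s", toString(names(x))),
       sprintf("Columns: %d", ncol(x)),
       sprintf("Rows: %d", nrow(x)),
       head(x),
       tail(x))
}
# lapply() calls Report() once per data frame and returns a list of the results
lapply(list_of_df, Report)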
The output is not so elegant but it is workable and allows me to quickly peruse the results.
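One thing to keep in mind is that the function passed to lapply only sees the data frame itself, so names(x) inside it gives the column names rather than the name of the list element. Below is a minimal sketch of one way to label each report, assuming the list is named. The example data, the names df_1 to df_3 and the helper report_named() are made up for illustration and are not part of the original code.

```r
# Hypothetical example data: three small data frames in a *named* list.
list_of_df <- lapply(1:3, function(i) {
  data.frame(dummy_v1 = 1:3, dummy_v2 = 4:6, dummy_v3 = i + 3)
})
names(list_of_df) <- paste0("df_", seq_along(list_of_df))  # made-up names

# report_named() is an illustrative helper: it receives the element name
# and the data frame as two separate arguments.
report_named <- function(nm, x) {
  list(
    title   = sprintf("Details of %s", nm),
    columns = sprintf("Columns: %d", ncol(x)),
    rows    = sprintf("Rows: %d", nrow(x)),
    head    = head(x),
    tail    = tail(x)
  )
}

# Map() walks the names and the data frames in parallel.
reports <- Map(report_named, names(list_of_df), list_of_df)

# Individual reports can then be looked up by name.
reports[["df_2"]]$title
```

Naming the inner elements (title, head, tail and so on) also makes it easier to pull out just one piece, for example reports$df_1$tail.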
If we want separate output for each dataframe, return a list. This works in base R without loading any external packages:
lapply(list_of_df, function(x) list(head(x), tail(x)))
With a for loop, the results need to be stored in an object
out <- vector('list', length(list_of_df))
for(i in seq_along(list_of_df)) {
  out[[i]][["head"]] <- head(list_of_df[[i]])
  out[[i]][["tail"]] <- tail(list_of_df[[i]])
}
Or, assigning both elements in a single statement (reusing the out list created above)
for(i in seq_along(list_of_df)) {
  out[[i]][c("head", "tail")] <- list(head(list_of_df[[i]]), tail(list_of_df[[i]]))
}
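If the goal is only to get the output into the notebook rather than to keep the results, another option is to call print() explicitly inside the loop. R auto-prints the value of an expression only at top level, so a bare head(...) inside a for loop is evaluated but never displayed. A minimal sketch, using made-up example data in place of the real list:

```r
# Hypothetical example data standing in for the real list of data frames.
list_of_df <- lapply(1:3, function(i) {
  data.frame(dummy_v1 = 1:3, dummy_v2 = 4:6, dummy_v3 = i + 3)
})

for (i in seq_along(list_of_df)) {
  cat("\n--- data frame", i, "---\n")  # simple separator between reports
  print(head(list_of_df[[i]]))          # explicit print(): no auto-printing inside loops
  print(tail(list_of_df[[i]]))
}
```

With 130 dataframes this produces a lot of output, so passing a smaller n to head() and tail(), for example head(x, n = 3), may make the sanity check easier to scan.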