Tags: r, loops, ggplot2, fwrite

How to generate a list of datasets and graphs and export them?


Here is the dataset:

library(data.table)

# Dataset called DT
DT <- data.table(
Store = rep(c("store_A","store_B","store_C","store_D","store_E"),4),
Amount = sample(1000,20))

I have two goals:

  • 1. Generate an independent grouped dataset for each store and export each one as a CSV file (readable in Excel).
  • 2. Generate an independent graph for each store and export each one as a PNG file.

*The two do not need to run in a single operation.

Constraint: so far I can only do this one store at a time, with basic operations like:

# For dataset & CSV export
store_A <- DT %>% filter(Store == "store_A") %>% group_by(Store) %>% summarise(Total = sum(Amount))

fwrite(store_A,"PATH/store_A.csv")

store_B <- DT %>% filter(Store == "store_B") %>% group_by(Store) %>% summarise(Total = sum(Amount))

fwrite(store_B,"PATH/store_B.csv")
.....
# For graph :

Plt_A <- ggplot(store_A,aes(x = Store, y = Total)) + geom_point()

ggsave("PATH/Plt_A.png", plot = Plt_A)

Plt_B <- ggplot(store_B,aes(x = Store, y = Total)) + geom_point()

ggsave("PATH/Plt_B.png", plot = Plt_B)
.....

*I have found approaches written with for loops, but I am unsure which is more efficient and actually works for generating graphs: for loops or the lapply family. The real dataset has over 2 million rows, 70 columns, and about 10k "Store" groups, so I worry that a for loop may run terribly slowly or even crash R. The bottleneck in the actual dataset is those 10k "Store" groups.


Solution

  • Since everything needs to run in a loop, nest the data by store and iterate over one row per store:

    library(tidyverse)
    library(data.table)
    
    setwd("Your working directory")
    
    # Dataset called DT
    DT <- data.table(
      Store = rep(c("store_A","store_B","store_C","store_D","store_E"),4),
      Amount = sample(1000,20)) %>% 
      # Arrange by store and amount
      arrange(Store, Amount) %>% 
      # Nest by store, so the loop index runs over one row per store
      nest(data = -Store)
    
    # Export CSVs by store
    for (i in seq_len(nrow(DT))) {
      # [[i]] extracts the store's data frame itself, not a one-element list
      write.csv(DT$data[[i]], paste(DT$Store[i], "csv", sep = "."))
    }
    
    # Export graphs by store
    for (i in seq_len(nrow(DT))) {
      # Histogram of Amount for the i-th store
      Graph <- DT$data[[i]] %>% 
        ggplot(aes(Amount)) + geom_histogram()
    
      # Pass the file name and plot explicitly instead of relying on argument matching
      ggsave(paste0(DT$Store[i], ".png"), plot = Graph, width = 14, height = 10, units = "cm")
    }
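
  • On the for loop versus lapply question: with 10k stores, the run time is most likely dominated by writing files and rendering plots rather than by the looping construct, so either style should work. As an alternative sketch (not part of the answer above), the same per-store export can be written with data.table's split() and fwrite(). The output folder out_dir, the names by_store, csv_paths and png_paths, the set.seed() call, and bins = 10 are assumptions made for illustration only.

```r
library(data.table)
library(ggplot2)

# Toy data from the question; set.seed() is added only for reproducibility.
set.seed(1)
DT <- data.table(
  Store  = rep(c("store_A", "store_B", "store_C", "store_D", "store_E"), 4),
  Amount = sample(1000, 20)
)

# Hypothetical output folder; a temp dir keeps the sketch self-contained.
out_dir <- file.path(tempdir(), "store_exports")
dir.create(out_dir, showWarnings = FALSE, recursive = TRUE)

# split() with `by` returns a named list holding one data.table per store.
by_store <- split(DT, by = "Store")

# One CSV per store, written with data.table's fwrite().
csv_paths <- vapply(names(by_store), function(s) {
  path <- file.path(out_dir, paste0(s, ".csv"))
  fwrite(by_store[[s]], path)
  path
}, character(1))

# One PNG per store; the plot object is passed to ggsave() explicitly.
png_paths <- vapply(names(by_store), function(s) {
  p <- ggplot(by_store[[s]], aes(x = Amount)) +
    geom_histogram(bins = 10) +
    ggtitle(s)
  path <- file.path(out_dir, paste0(s, ".png"))
  ggsave(path, plot = p, width = 14, height = 10, units = "cm")
  path
}, character(1))
```

    Because each plot is saved and then discarded inside the function, only one plot object is held at a time, which may help with memory when the number of stores is large. Replace out_dir with a real folder before using this on actual data.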