Loop over multidimensional NumPy array in R


I'm new to R and am having trouble writing a for loop. I have a multidimensional array (loaded from a NumPy file) in R and would like to automate extracting parts of it. My array has a shape of (500, 192). I would like to plot a graph for sample 1, sample 2, and so on up to sample 500.

AP has the structure: num [1:500, 1:192] 0.0323 0.0532 0.0135 0.0474 0.2026 ...

AP.shap has the structure: num [1:192] 3.23e-02 4.88e-04 1.39e-03 7.49e-04 5.82e-05 ...

library(reticulate)
library(RcppCNPy)
library(ggplot2)

#Sample 1
AP.shap <- (AP[1 ,]) # Sample 1

# convert to dataframe
typeof(AP.shap) # type double
AP.shap <- as.data.frame(AP.shap)
typeof(AP.shap) # list

# New Column with Nrow
AP.shap$Hour <- seq.int(nrow(AP.shap))

columnnames <- c("Shap", "Hour")

colnames(AP.shap) <- columnnames 

#Plot
ggplot(AP.shap, aes(Hour, Shap))+
  geom_bar(stat = "identity") + theme_minimal() + 
  ggtitle(expression(paste("Hours of Sample 1 - Air Pressure - G_PM"[1], " - All Season"))) + 
  xlab("Hour") +
  ylab("Shap Value")
# ggsave() is a separate call, not a layer, so it is not added with `+`
ggsave("1_Hour_AP_G_allseason.png", plot = last_plot(), device = "png", path = "xy")

The next one would be:

#Sample 2
AP.shap <- (AP[2 ,]) # Sample 2

# convert to dataframe
typeof(AP.shap) # type double
AP.shap <- as.data.frame(AP.shap)
typeof(AP.shap) # list

# New Column with Nrow
AP.shap$Hour <- seq.int(nrow(AP.shap))
columnnames <- c("Shap", "Hour")
colnames(AP.shap) <- columnnames 

#Plot
ggplot(AP.shap, aes(Hour, Shap))+
  geom_bar(stat = "identity") + theme_minimal() + 
  ggtitle(expression(paste("Hours of Sample 2 - Air Pressure - G_PM"[1], " - All Season"))) + 
  xlab("Hour") +
  ylab("Shap Value")
ggsave("2_Hour_AP_G_allseason.png", plot = last_plot(), device = "png", path = "xy")

Graph Sample 1


Solution

  • Generalize your process into a user-defined function that takes the sample number as its input parameter, since that is the only item that changes. Use that parameter to slice the data for the sample and to build the title and file name. See below where num is used.

    Then call it iteratively with for, while, or lapply; the last of these can store the plot objects in a list for later use.

    build_plot <- function(num) {
    
       # Sample by num
       AP.shap <- (AP[num, ]) # Sample num
    
       # convert to dataframe
       AP.shap <- as.data.frame(AP.shap)
       # New Column with Nrow
       AP.shap$Hour <- seq.int(nrow(AP.shap))
       colnames(AP.shap) <-  c("Shap", "Hour")
    
       # Plot
       p <- ggplot(AP.shap, aes(Hour, Shap)) +
              geom_bar(stat = "identity") + theme_minimal() + 
              # bquote() substitutes the value of num via .(num); expression() would print "num" literally
              ggtitle(bquote(paste("Hours of Sample ", .(num),
                                   " - Air Pressure - G_PM"[1], " - All Season"))) + 
              xlab("Hour") + ylab("Shap Value")
    
       # Save 
       ggsave(paste0(num, "_Hour_AP_G_allseason.png"), plot = p, device = "png", path = "xy")
    
       return(p)    
    }
    
    # SAVES EACH PLOT FILE AND STORES PLOT TO R LIST
    plot_list <- lapply(seq(1, 500), build_plot)
    
    # INTERACTIVELY DISPLAY PLOT
    plot_list[[1]]
    plot_list[[2]]
    plot_list[[3]]
    

    For traditional looping, see the for and while approaches below, which do not store the plot objects in an R list. Note: be cautious about printing 500 plots in your environment. Remove print() to just save the plots to file.

    # SAVES EACH PLOT FILE AND DISPLAYS PLOT TO SCREEN
    for (i in seq(1, 500)) {
       print(build_plot(i))
    }
    
    i = 1
    # SAVES EACH PLOT FILE AND DISPLAYS PLOT TO SCREEN
    while(i <= 500) {
       print(build_plot(i))
       i = i + 1
    }
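
The pieces of build_plot that change per sample (the sliced data, the title, and the file name) can also be checked on their own, without ggplot2. The following is only a minimal base-R sketch: AP_demo is a small made-up stand-in for AP, and the helper names sample_df, sample_title and sample_file are hypothetical, not part of the answer above. It also derives the loop range from the matrix dimensions with seq_len(nrow(...)) instead of hard-coding 500.

```r
# Toy stand-in for AP (assumption: the real array is 500 x 192)
set.seed(1)
AP_demo <- matrix(runif(3 * 4), nrow = 3, ncol = 4)

# Build the two-column data frame for one sample (one row of the matrix)
sample_df <- function(mat, num) {
  data.frame(Hour = seq_len(ncol(mat)), Shap = mat[num, ])
}

# Build a plotmath title; .(num) is replaced by the value of num,
# whereas expression() would leave the symbol num unevaluated
sample_title <- function(num) {
  bquote(paste("Hours of Sample ", .(num),
               " - Air Pressure - G_PM"[1], " - All Season"))
}

# Build the output file name for one sample
sample_file <- function(num) {
  paste0(num, "_Hour_AP_G_allseason.png")
}

# Iterate over however many rows the matrix actually has
df_list <- lapply(seq_len(nrow(AP_demo)), function(i) sample_df(AP_demo, i))
```

Inside build_plot, the same idea would let the calling code use seq_len(nrow(AP)) rather than seq(1, 500), so the loop keeps working if the number of samples changes.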