Modifying values of plot variable (in R)


I have the following dataframe that depicts how many days a user (Id) has had his activity tracked for different "features".

Rows: 35
Columns: 12
Groups: Id [35]
$ Id                         <chr> "1503960366", "1624580081", "1644430081", "1844505072", "1…
$ Distance_DaysTracked       <int> 48, 49, 40, 28, 28, 42, 42, 38, 32, 42, 30, 41, 38, 20, 41…
$ LoggedActivity_DaysTracked <int> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0…
$ Calories_DaysTracked       <int> 61, 62, 60, 62, 62, 62, 62, 62, 49, 62, 50, 61, 62, 35, 59…
$ Intensities_DaysTracked    <int> 61, 62, 60, 33, 49, 62, 62, 43, 49, 62, 50, 50, 38, 21, 58…
$ MET_DaysTracked            <int> 61, 61, 58, 61, 61, 62, 62, 61, 48, 62, 49, 60, 61, 34, 58…
$ Sleep_DaysTracked          <int> 50, 0, 8, 7, 34, 1, 59, 1, 45, 0, 0, 44, 23, 0, 53, 23, 53…
$ Steps_DaysTracked          <int> 61, 62, 60, 32, 48, 62, 62, 43, 49, 61, 50, 50, 38, 21, 59…
$ Weight_DaysTracked         <int> 3, 0, 0, 0, 2, 0, 0, 0, 1, 4, 0, 0, 0, 0, 2, 0, 1, 6, 1, 0…
$ Fat_DaysTracked            <int> 2, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0…
$ BMI_DaysTracked            <int> 3, 0, 0, 0, 2, 0, 0, 0, 1, 4, 0, 0, 0, 0, 2, 0, 1, 6, 1, 0…
$ HR_DaysTracked             <int> 0, 0, 0, 0, 0, 42, 5, 0, 32, 0, 0, 0, 27, 0, 0, 30, 0, 42,…

I want to create plots (geom_col) for every feature, where the x-axis = 'Id' and the y-axis = '..._DaysTracked'. So I need a total number of 11 graphs. I want to include a trend line, as well as a small calculation outputting the number of users who used the feature. I have created the following plot:

DaysTracked_All %>% 
  ggplot() +
  geom_col(mapping = aes(x = Id, 
                         y = Distance_DaysTracked, 
                         fill = Distance_DaysTracked), 
           width = 0.8) +
  theme_minimal() +
  theme(axis.text.x = element_text(angle = 90)) +
  geom_hline(yintercept = mean(DaysTracked_All$Distance_DaysTracked), 
             colour = "plum2", 
             linewidth = 1) +
  ylim(0, 65) +
  
  annotate("text",                                                    # add mean value to 
           x = 37,                                                    # trendline    
           y = mean(DaysTracked_All$Distance_DaysTracked)+2, 
           label =  round(mean(DaysTracked_All$Distance_DaysTracked), 
                          digits = 1), 
           colour = "plum2", fontface = "bold", size = 4) +
  coord_cartesian(clip = "off") +
  
  labs(tag = sprintf("nUsers: %i",                                    # add tag for number of    
                     DaysTracked_All %>%                              # people who used feature 
                       select(Id, Distance_DaysTracked) %>%
                       filter(Distance_DaysTracked != 0) %>% 
                       nrow()
                     )) +   
  
  theme(plot.tag.position = c(0.83, 0.93),                            # change pos/colour/size           
        plot.tag = element_text(colour = "plum2",                     # of tag
                                size = 14))

The graph looks like this:

[Figure: plot for column 'Distance_DaysTracked']

Now to my question. I need this same code chunk eleven times, once for each feature. I could of course simply swap out the column names each time, but is there a way to store the graph in a variable (p, for example) and then change only the necessary values, in this case every 'Distance_DaysTracked', which has to be replaced with the other column names? I want to reduce the number of lines of code and avoid writing these big chunks each time. The only other thing that might have to be adjusted is the value of 'plot.tag.position', as the graphs vary in size and I want the 'nUsers' output aligned with the legend.

I tried playing around a bit with different things, but wasn't able to get anything working.

Thank you very much for all your time and effort in advance!


Solution

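  • You can build the plots in a loop with lapply, passing each column name as a string and referring to it inside aes() through the .data pronoun. A small example with the built-in iris data:

    library(ggplot2)
    # One histogram per numeric column of iris (column 5, Species, is dropped).
    # The \(var) lambda shorthand needs R >= 4.1; use function(var) on older versions.
    plots <- lapply(names(iris)[-5], 
                    \(var) ggplot(iris, aes(fill = Species, x = .data[[var]])) + 
                      geom_histogram())
    # The result is a plain list of ggplot objects; print one to display it.
    print(plots[[1]])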
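Applied to the plot from the question, the same idea could look roughly like the sketch below. It is only an illustration: the helper name plot_days_tracked, its tag_pos argument, and the three-row stand-in data frame are assumptions made up for this example, not part of the original post. The helper places the mean label two positions to the right of the last bar, which matches x = 37 for the 35 users in the question.

```r
library(ggplot2)

# Hypothetical stand-in for DaysTracked_All (made-up values, three users).
DaysTracked_All <- data.frame(
  Id = c("A", "B", "C"),
  Distance_DaysTracked = c(48L, 49L, 40L),
  Sleep_DaysTracked = c(50L, 0L, 8L)
)

# Hypothetical helper: builds the question's plot for one column,
# given the column name as a string.
plot_days_tracked <- function(data, col, tag_pos = c(0.83, 0.93)) {
  values <- data[[col]]
  avg <- mean(values)

  ggplot(data) +
    geom_col(aes(x = Id, y = .data[[col]], fill = .data[[col]]),
             width = 0.8) +
    # horizontal line at the column mean, labelled with its rounded value
    geom_hline(yintercept = avg, colour = "plum2", linewidth = 1) +
    annotate("text",
             x = nrow(data) + 2,
             y = avg + 2,
             label = round(avg, digits = 1),
             colour = "plum2", fontface = "bold", size = 4) +
    ylim(0, 65) +
    coord_cartesian(clip = "off") +
    # tag with the number of users whose value in this column is non-zero
    labs(y = col, fill = col,
         tag = sprintf("nUsers: %i", sum(values != 0))) +
    theme_minimal() +
    theme(axis.text.x = element_text(angle = 90),
          plot.tag.position = tag_pos,
          plot.tag = element_text(colour = "plum2", size = 14))
}

# One plot per column whose name ends in "_DaysTracked".
feature_cols <- grep("_DaysTracked$", names(DaysTracked_All), value = TRUE)
plots <- lapply(feature_cols,
                function(col) plot_days_tracked(DaysTracked_All, col))
names(plots) <- feature_cols
```

A single plot can then be shown with print(plots[["Distance_DaysTracked"]]). If one graph needs a different tag position, call the helper directly for that column, for example plot_days_tracked(DaysTracked_All, "Sleep_DaysTracked", tag_pos = c(0.8, 0.9)), where the position values are placeholders to adjust by eye.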