r dataframe plot multiple-columns scatter-plot

R: How to make a scatterplot where y-values are spread across multiple columns?


I have a df with n=1136 rows, each representing a subject. I am trying to create a scatterplot series for each subject, with the x-values representing time intervals and the y-values representing the actual readings. In this case, we are just assuming that each reading was taken at 15-minute intervals. I am lost as to how to go about this without having to create individual scatterplots for each subject.

Study ID   Reading 1   Reading 2   Reading 3   ...   Reading 50
123        45          23
124        56          45
125        912         56

I've tried combining all the readings into a single column but realized that I would be creating over 1,000 scatterplots to capture the entire data set.


Solution

  • Below is a ggplot2/dplyr solution. I agree with other commenters that making 1100+ individual plots is not realistic. It may be better to plot everyone in one plot. If each reading is a 15-minute interval (e.g., Reading1 represents 0 minutes to 15 minutes), then you will need to lengthen your data and combine all readings into one long column. Once the readings are combined into one column, you can transform that column into 15-minute intervals to better represent your x-axis.

    library(tidyverse)

    set.seed(1)  # make the simulated readings reproducible

    # Simulated data: 11 subjects with 6 readings each
    StudyID <- 100:110
    Reading1 <- sample(20:100, 11, TRUE); Reading2 <- sample(20:50, 11, TRUE); Reading3 <- sample(10:30, 11, TRUE)
    Reading4 <- sample(20:100, 11, TRUE); Reading5 <- sample(20:50, 11, TRUE); Reading6 <- sample(10:30, 11, TRUE)
    
    df <- data.frame(StudyID, Reading1, Reading2, Reading3, Reading4, Reading5, Reading6) %>%
      # lengthen: one row per subject per reading
      pivot_longer(cols = -StudyID, names_to = "time") %>%
      mutate(StudyID = factor(StudyID)) %>%
      # "Reading3" -> 3 -> 45 (upper bound of the interval, in minutes)
      mutate(time = as.numeric(str_replace(time, "Reading", ""))) %>%
      mutate(time = time * 15) %>%
      mutate(lowertime = time - 15) %>%
      # interval label such as "30-45", ordered by time rather than alphabetically,
      # so that "105-120" does not sort before "15-30" when there are many readings
      mutate(timeinterval = fct_reorder(paste0(lowertime, "-", time), time))
    
    ggplot(df, aes(x = timeinterval, y = value
                   , color = StudyID # <- remove this to prevent each subject getting its own color
                   )) +
      geom_point() + 
      ylab("Reading Value") + xlab("Time Interval in Minutes") + 
      ggtitle("Reading Plot") + theme(legend.position = "none")
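
With 50 readings and over a thousand subjects, a numeric time axis and one faint line per subject may read better than discrete interval labels. The following is only a sketch under stated assumptions: the `wide` data frame is simulated (200 subjects rather than 1136, to keep it quick), the column names are assumed to be `Reading1` through `Reading50` with no spaces, and the names `long`, `reading`, and `minutes` are made up for illustration.

```r
library(tidyr)
library(dplyr)
library(ggplot2)

set.seed(42)  # reproducible simulated data

# Hypothetical stand-in for the real table: 200 subjects x 50 readings
n_subjects <- 200
n_readings <- 50
readings <- as.data.frame(
  matrix(sample(20:100, n_subjects * n_readings, replace = TRUE),
         nrow = n_subjects)
)
names(readings) <- paste0("Reading", seq_len(n_readings))
wide <- data.frame(StudyID = 100 + seq_len(n_subjects), readings)

# Lengthen to one row per subject per reading, turning the
# column name "Reading7" directly into the integer 7
long <- wide %>%
  pivot_longer(cols = -StudyID,
               names_to = "reading",
               names_prefix = "Reading",
               names_transform = list(reading = as.integer)) %>%
  mutate(minutes = reading * 15)  # end of each 15-minute interval

# One semi-transparent line per subject on a numeric time axis
p <- ggplot(long, aes(x = minutes, y = value, group = StudyID)) +
  geom_line(alpha = 0.1) +
  geom_point(alpha = 0.1, size = 0.5) +
  labs(x = "Time (minutes)", y = "Reading Value")
p
```

If per-subject panels are still wanted, a small subset of subjects could be filtered out of `long` and drawn with the same code plus `facet_wrap(~ StudyID)`; doing that for all subjects at once would likely be unreadable.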
    
