r, ggplot2, cdf

Plotting CDFs for many columns in a data frame


I can plot cumulative distribution plots for 3 data series using

    library(ggplot2)

    a1 <- rnorm(1000, 0, 3)
    a2 <- rnorm(1000, 1, 4)
    a3 <- rnorm(800, 2, 3)

    df <- data.frame(x = c(a1, a2, a3), ggg = factor(rep(1:3, c(1000, 1000, 800))))
    ggplot(df, aes(x, colour = ggg)) +
      stat_ecdf() +
      scale_colour_hue(name = "my legend", labels = c('AAA', 'BBB', 'CCC'))

But now I have around 100 observed data series (for example a1, a2, ..., a100), each with 5000 rows, and I want their cumulative distribution plots all together in one figure. I don't want to use a loop; I would rather use functions like apply or tapply together with the ggplot2 package.

**Sample data: df = data.frame(matrix(rnorm(20), nrow=5000, ncol=100))**

Solution

  • You could try combining ls() and mget() to collect the vectors into a list, for example

    library(ggplot2)

    a1 <- rnorm(1000, 0, 3)
    a2 <- rnorm(1000, 1, 4)
    a3 <- rnorm(800, 2, 3)
    a100 <- rnorm(800, 2, 3) # <- adding some more vectors
    a200 <- rnorm(800, 2, 3) # <- adding some more vectors
    a300 <- rnorm(800, 2, 3) # <- adding some more vectors
    a1000 <- rnorm(800, 2, 3) # <- adding some more vectors

    # Collect every object named "a" followed by digits into a named list
    # (ls() sorts names alphabetically, so a100 comes before a2)
    temp <- mget(ls(pattern = "^a\\d+$"))
    # One row per value; ggg records which vector each value came from
    df <- data.frame(x = unlist(temp), ggg = factor(rep(seq_along(temp), lengths(temp))))
    ggplot(df, aes(x, colour = ggg)) +
      stat_ecdf() +
      scale_colour_hue(name = "my legend", labels = names(temp))
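
As a possible shortcut for the data-frame step above, base R's stack() turns a named list into the same long layout in one call. This is only a sketch: the three-element list below is a hypothetical stand-in for the result of mget(), and the plot is skipped if ggplot2 is not installed.

```r
# Hypothetical stand-in for the list returned by mget() above
temp <- list(a1 = rnorm(1000, 0, 3),
             a2 = rnorm(1000, 1, 4),
             a3 = rnorm(800, 2, 3))

# stack() returns one row per value: `values` holds the numbers,
# `ind` is a factor recording which list element each value came from
df <- stack(temp)

# Optional plot; skipped when ggplot2 is not installed
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  p <- ggplot(df, aes(values, colour = ind)) +
    stat_ecdf() +
    scale_colour_hue(name = "my legend")
  print(p)
}
```

Because the factor levels of ind are already the vector names, no labels argument is needed in the legend scale.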
    

    Edit: For your new question, try this on your df. It won't look very good with the sample df, because rnorm(20) is recycled to fill the matrix, so every column ends up with the same values.

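    library(reshape2)
    library(ggplot2)
    # Long format: `variable` holds the column name, `value` the observation
    df2 <- melt(df)
    # Map the values to x; stat_ecdf() computes the cumulative distribution
    # of that aesthetic, so no row-index column is needed
    ggplot(df2, aes(value, colour = variable)) +
      stat_ecdf() +
      scale_colour_hue(name = "my legend", labels = names(df))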
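
To see distinct curves, the sample data would need a separate draw for every cell. The sketch below assumes that is what was intended: it draws n_rows * n_cols values, and it also shows how sapply() with base R's ecdf() can evaluate every column's CDF without an explicit loop. The names n_rows, n_cols, grid and cdf_at_grid are illustrative and do not come from the question.

```r
set.seed(1)  # only to make the sketch reproducible
n_rows <- 5000
n_cols <- 100

# Draw n_rows * n_cols values so nothing is recycled and the columns differ
df <- data.frame(matrix(rnorm(n_rows * n_cols), nrow = n_rows, ncol = n_cols))

# Evaluate each column's empirical CDF on a shared grid, without a loop:
# ecdf(col) returns a step function, which is then called on the grid
grid <- seq(-3, 3, by = 0.5)
cdf_at_grid <- sapply(df, function(col) ecdf(col)(grid))
# cdf_at_grid has one row per grid point and one column per variable

# Long format with base R: `values` holds the data, `ind` the column name
long <- stack(df)

# Optional plot; skipped when ggplot2 is not installed
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  p <- ggplot(long, aes(values, colour = ind)) +
    stat_ecdf() +
    theme(legend.position = "none")  # 100 legend entries would crowd the plot
  print(p)
}
```

The matrix cdf_at_grid is handy when you want the numbers themselves, for example to compare columns at a given quantile, while the ggplot call draws all 100 curves in one figure.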