r ggplot2 boxplot line-plot

ggplot: R code to make box plot and line plots for multiple level simulation data faceted by values


I need help making plots from this data. I am very new to R, but I believe it can produce the plots I need. I apologize if this has already been answered; I have been searching here for days without finding what I want, perhaps because I did not know what to search for. The data comes from a simulation of networkx graphs, generated by several nested loops. I have two factors, "scf" and "ran". For each of these, I have sizes between 500 and 15,000. For each size, I have the percentages 1%, 2%, 3%, 5%, 7% and 10%. For each percentage, the simulation runs 10 iterations, numbered 0 to 9. Each value is also recorded both before (pre) and after (post) manipulation.
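
For readers following along, the nesting described above can be sketched as a long-format grid with one row per combination. This is only an illustration of the layout: the column names `type`, `size` and `iteration` are hypothetical (the posted sample only shows `percentP`, `density` and `status`), and the two sizes are placeholders for the range endpoints, since the intermediate sizes are not listed.

```r
# Hypothetical layout of the full simulation output (base R only).
# Only the factor levels come from the description; the column names
# `type`, `size` and `iteration` are assumptions made for illustration.
design <- expand.grid(
  status    = c("pre", "post"),
  iteration = 0:9,
  percentP  = c("1%", "2%", "3%", "5%", "7%", "10%"),
  size      = c(500, 15000),  # placeholder: only the two endpoints are known
  type      = c("scf", "ran"),
  stringsAsFactors = FALSE
)

# Fix the level order explicitly; left as character, "10%" would sort
# before "2%" on a discrete axis.
design$percentP <- factor(
  design$percentP,
  levels = c("1%", "2%", "3%", "5%", "7%", "10%")
)

nrow(design)  # 2 * 10 * 6 * 2 * 2 = 480 rows
```

Keeping the data in this one-row-per-measurement shape is what lets a plotting call map `status` to colour or fill and use `size` or `percentP` as the faceting variable.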

What I want to achieve:

1. A boxplot of the diameter, with the pre and post values side by side for each percentage, faceted by size.
2. Line plots of the pre and post values of numE, all on a single plot, faceted by size.
3. Line plots of the pre and post values of density, faceted by percentP, for each group of sizes separately.

I have tried several things but cannot get the output I want.

  ggplot(simulation3_mod_500, aes(x = simulation3_mod_500$percentP, y = simulation3_mod_500$density, group = simulation3_mod_500$node_percent, col = simulation3_mod_500$status)) +
  geom_line() + 
  xlab('Percent') +
  ylab('Density')  

And this too...

ggplot(data = simulation3_mod, mapping = aes(x = simulation3_mod$percentP, y = simulation3_mod$density, color = simulation3_mod$status)) +
  geom_line() +
  facet_wrap(~ simulation3_mod_500$density)

And this...

ggplot(simulation3_mod_500, aes(x=simulation3_mod_500$percentP , y=simulation3_mod_500$density, fill=simulation3_mod_500$status)) +
  geom_boxplot() + 
  facet_wrap(~ simulation3_mod_500$percentP)  

Here is part of my data. Note that this sample is for size 500 only; the full data contains other sizes too.

simulation3_mod_500 <-
  data.frame(
    stringsAsFactors = FALSE,
    percentP = c(
      "1%", "1%", "1%", "1%", "1%", "1%", "1%", "1%", "1%", "1%",
      "1%", "1%", "1%", "1%", "1%", "1%", "1%", "1%", "1%", "1%",
      "2%", "2%", "2%", "2%", "2%", "2%", "2%", "2%", "2%", "2%", "2%",
      "2%"
    ),
    density = c(
      0.008577154, 0.008661514, 0.008661514, 0.008764242,
      0.008764242, 0.008877907, 0.008877907, 0.008968337,
      0.008968337, 0.008918499, 0.008918499, 0.008955224, 0.008955224,
      0.009093437, 0.009093437, 0.009235578, 0.009235578, 0.009362444,
      0.009362444, 0.009522395, 0.009522395, 0.009719645, 0.009719645,
      0.010011171, 0.010011171, 0.010173328, 0.010173328, 0.009743716,
      0.009743716, 0.010035448, 0.010035448, 0.010049866
    ),
    status = c(
      "pre", "post", "pre", "post", "pre", "post", "pre", "post",
      "pre", "post", "pre", "post", "pre", "post", "pre", "post",
      "pre", "post", "pre", "post", "pre", "post", "pre", "post", "pre",
      "post", "pre", "post", "pre", "post", "pre", "post"
    )
  )


Solution

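  • @Camille had the right idea: you don't need to call simulation3_mod_500$ in your aes(). Because the data frame is already passed as the first argument to ggplot(), bare column names are enough.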

    ggplot(simulation3_mod_500, aes(x = percentP , y = density, fill = status)) +
      geom_boxplot() +
      facet_wrap(~percentP, scales = "free_x")  
    

    I added scales = "free_x" so that each facet shows only the x value present in it, rather than both panels showing 1% and 2%.

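
The answer above covers only the boxplot of the posted sample. As a rough sketch of how the same bare-column-name style might extend to the other requests, the code below rebuilds the sample and draws (a) pre/post boxplots side by side for each percentage, faceted by size, and (b) pre/post line plots faceted by percentP. Several things here are assumptions rather than part of the original post: the `size` column is added by hand (the sample is stated to be size 500), the `iteration` column is reconstructed from row order on the guess that rows appear in run order, and `density` stands in for diameter and numE, which are not in the sample.

```r
library(ggplot2)

# Rebuild the posted sample (values copied from the question).
df <- data.frame(
  percentP = rep(c("1%", "2%"), times = c(20, 12)),
  status   = rep(c("pre", "post"), times = 16),
  density  = c(
    0.008577154, 0.008661514, 0.008661514, 0.008764242,
    0.008764242, 0.008877907, 0.008877907, 0.008968337,
    0.008968337, 0.008918499, 0.008918499, 0.008955224, 0.008955224,
    0.009093437, 0.009093437, 0.009235578, 0.009235578, 0.009362444,
    0.009362444, 0.009522395, 0.009522395, 0.009719645, 0.009719645,
    0.010011171, 0.010011171, 0.010173328, 0.010173328, 0.009743716,
    0.009743716, 0.010035448, 0.010035448, 0.010049866
  ),
  stringsAsFactors = FALSE
)

# Assumption: the sample is for size 500, so tag it that way.
df$size <- 500

# Assumption: rows are in run order, so the n-th "pre" (or "post") row
# within a percentage is iteration n - 1.
df$iteration <- ave(seq_len(nrow(df)), df$percentP, df$status,
                    FUN = seq_along) - 1

# Draw "pre" before "post" in legends and dodged boxes.
df$status <- factor(df$status, levels = c("pre", "post"))

# (a) pre/post side by side for each percentage, one panel per size
p_box <- ggplot(df, aes(x = percentP, y = density, fill = status)) +
  geom_boxplot() +
  facet_wrap(~ size) +
  labs(x = "Percent", y = "Density")

# (b) pre and post lines over iterations, one panel per percentage
p_line <- ggplot(df, aes(x = iteration, y = density,
                         colour = status, group = status)) +
  geom_line() +
  facet_wrap(~ percentP) +
  labs(x = "Iteration", y = "Density")

print(p_box)
print(p_line)
```

With the full data, the same calls should work once real `size` and iteration columns are present; swapping `y = density` for the diameter or numE column would give the other plots requested in the question.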