How do I add data from another file to a plot already built with ggplot2 (R)?


Using a dataset, I have created the following plot:

enter image description here

I'm trying to create the following plot:

enter image description here

Specifically, I am trying to overlay Twitter names on the first image. To do this, I have a dataset containing each name and a value that corresponds to a position on the axes. A snippet looks something like:

Name             Score
@tedcruz         0.108
@RealBenCarson   0.119

Does anyone know how I can plot this data (from one CSV file) over my original graph (which is constructed from data in a different CSV file)? I am confused because in ggplot2 you specify the data you want to use at the start, so I am not sure how to incorporate other data.

Thank you.


Solution

  • Your question about combining several data sources in ggplot to draw different plot elements is answered in this post here

    I don't know for sure how this will apply to your specific data, so here is an example that might help you move forward.

    Imagine we have two data.frames (see below) and we want to obtain a plot similar to the one you presented.

    data1 <- data.frame(list(
      x=seq(-4, 4, 0.1), 
      y=dnorm(x = seq(-4, 4, 0.1))))
    data2 <- data.frame(list(
      "name"=c("name1", "name2"), 
      "Score" = c(-1, 1)))
    

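    The first step is to find the "y" coordinates of the names in the second data.frame (data2). To do this I added a y column to data2. y is defined here as a sequence of evenly spaced points running from the max value of y down to the min value of y in data1, with a small margin at each end for aesthetics.

    # vertical extent of the curve in data1
    range_y <- max(data1$y) - min(data1$y)
    # margin (5% of the range) kept clear at the top and bottom
    space_y <- range_y * 0.05
    # one y position per name, spread evenly from top to bottom
    data2$y <- seq(from = max(data1$y) - space_y, to = min(data1$y) + space_y, length.out = nrow(data2))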
    

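    Then we can use ggplot() to plot data1 and data2 together. The ggplot() call sets data1 as the default data, and each layer that needs data2 receives it through its own data argument. For the current example I did this: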

    library(ggplot2)
    p <- ggplot(data=data1, aes(x=x, y=y)) + 
      geom_point() + # for the data1 just plot the points
      geom_pointrange(data=data2, aes(x=Score, y=y, xmin=Score-0.5, xmax=Score+0.5)) +
      geom_text(data = data2, aes(x = Score, y = y+(range_y*0.05), label=name))
    p 
    

    which gave the following plot:

    Example ggplot with two data
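
    The same pattern could carry over to a density plot like the one in the question. The following is only a sketch under assumptions: the data frames `scores` and `labels` are simulated stand-ins for the two CSV files (which would normally be loaded with read.csv(), using whatever file names you have), and the column names `Score` and `Name` are taken from the snippet in the question. The density layer uses the default data, while the point and text layers are given `labels` explicitly.

    ```r
    library(ggplot2)

    # Hypothetical stand-ins for the two CSV files (values are simulated,
    # not the asker's real data).
    set.seed(1)
    scores <- data.frame(Score = rnorm(500, mean = 0.1, sd = 0.05))
    labels <- data.frame(
      Name  = c("@tedcruz", "@RealBenCarson"),
      Score = c(0.108, 0.119)
    )

    # Estimate the density once, only to find a sensible height for the labels,
    # then stagger the labels vertically so they do not overlap.
    dens <- density(scores$Score)
    labels$y <- seq(from = max(dens$y) * 0.9, to = max(dens$y) * 0.1,
                    length.out = nrow(labels))

    p <- ggplot(scores, aes(x = Score)) +
      geom_density() +                                     # default data: scores
      geom_point(data = labels, aes(x = Score, y = y)) +   # layer data: labels
      geom_text(data = labels, aes(x = Score, y = y, label = Name),
                hjust = -0.1)                              # nudge text right of the point
    p
    ```

    Because `labels` is passed per layer, the two data frames never need to be merged or to have the same number of rows.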