
How do I plot TRUE values over time in R?


For example, I have a data frame like this:

Year|Value
2013|TRUE
2013|TRUE
2013|TRUE
2013|TRUE
2013|FALSE
2013|FALSE
2013|TRUE
2013|FALSE
2014|TRUE
2014|FALSE
2014|FALSE
2014|TRUE
2015|TRUE
2015|TRUE
2015|FALSE
2015|FALSE
2015|TRUE
2015|TRUE

I want to plot a line graph of the total number of TRUE values per year.

I have tried

data <- data.frame('t'=year, 'a'=Value)
plot(data)

...but it puts the year on the x-axis and either 0 or 1 (i.e. FALSE or TRUE) on the y-axis, rather than the number of TRUEs per year.


Solution

  • I want to plot a line graph of the total number of TRUE values per year.

    The trick is to transform your data so that it holds what you want the plot to show: the number of TRUE values for each year, rather than one row per individual observation.

    Here's a dplyr approach to reducing the data. It filters for TRUE values and then counts how many rows of TRUE values appear for each year.

    reduce

    library(dplyr)
    library(ggplot2)
    
    # Rebuild the example data: one row per observation
    tab <- data.frame(
      Year  = rep(c(2013L, 2014L, 2015L), times = c(8, 4, 6)),
      Value = c(TRUE, TRUE, TRUE, TRUE, FALSE, FALSE, TRUE, FALSE,
                TRUE, FALSE, FALSE, TRUE,
                TRUE, TRUE, FALSE, FALSE, TRUE, TRUE)
    )

    # Keep only the TRUE rows, then count the remaining rows per year
    tab_sum <- tab %>%
      group_by(Year) %>%
      filter(Value) %>%
      summarise(trues = n())
    # Source: local data frame [3 x 2]
    # 
    #    Year trues
    #   (int) (int)
    # 1  2013     5
    # 2  2014     2
    # 3  2015     4
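
    One thing to be aware of with the filter-then-count pipeline above: a year that has no TRUE rows at all would disappear from the summary instead of showing up with a count of 0. A possible variation, sketched here with the same example data, is to sum the logical column directly. The name tab_sum2 is made up for this illustration.

    ```r
    library(dplyr)

    # Same example data as above, rebuilt so this block runs on its own.
    tab <- data.frame(
      Year  = rep(c(2013L, 2014L, 2015L), times = c(8, 4, 6)),
      Value = c(TRUE, TRUE, TRUE, TRUE, FALSE, FALSE, TRUE, FALSE,
                TRUE, FALSE, FALSE, TRUE,
                TRUE, TRUE, FALSE, FALSE, TRUE, TRUE)
    )

    # sum() treats TRUE as 1 and FALSE as 0, so it counts the TRUEs in each
    # group. Years with only FALSE values are kept, with a count of 0.
    tab_sum2 <- tab %>%
      group_by(Year) %>%
      summarise(trues = sum(Value))
    ```

    For this data every year has at least one TRUE, so the counts should match the table above.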
    

    plot

    Now each row in the data gives an x and y pair for the plot:

    ggplot(tab_sum, aes(Year, trues)) + geom_line()
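
    Since the question started from base plot(), here is a rough base R sketch of the same idea that needs no extra packages. It swaps in aggregate() for the dplyr step; the name counts and the axis labels are chosen just for this example.

    ```r
    # Same example data as in the answer, rebuilt so this block runs on its own.
    tab <- data.frame(
      Year  = rep(c(2013L, 2014L, 2015L), times = c(8, 4, 6)),
      Value = c(TRUE, TRUE, TRUE, TRUE, FALSE, FALSE, TRUE, FALSE,
                TRUE, FALSE, FALSE, TRUE,
                TRUE, TRUE, FALSE, FALSE, TRUE, TRUE)
    )

    # Count TRUEs per year; sum() treats TRUE as 1 and FALSE as 0.
    counts <- aggregate(Value ~ Year, data = tab, FUN = sum)

    # type = "b" draws both the line and the points. xaxt = "n" suppresses the
    # default x-axis so that axis() can label whole years only.
    plot(counts$Year, counts$Value, type = "b", xaxt = "n",
         xlab = "Year", ylab = "Number of TRUE values")
    axis(1, at = counts$Year)
    ```

    The key difference from the original attempt is that plot() now receives one row per year instead of one row per observation.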