
Struggling with a lineplot (ggplot)


I have the following (sample) data:

testdata <- data.frame(theft=sample(size=100, c("yes", "no"), replace=T),
                   assault=sample(size=100, c("yes", "no"), replace=T),
                   robbery=sample(size=100, c("yes", "no"), replace=T),
                   agegrp=sample(size=100, c("10-20", "21-40", ">40"), replace=T))

theft <- table(testdata$theft, testdata$agegrp)[2,]
assault <- table(testdata$assault, testdata$agegrp)[2,]
robbery <- table(testdata$robbery, testdata$agegrp)[2,]

table <- rbind(theft, assault, robbery)

My goal is to create a line plot (with ggplot) showing three lines, one per offence type, across the age groups. Do I first have to rearrange the data into something like this?

offence  agegrp   count
-------  ------   -----
theft    >40      22
theft    10-20    11
theft    21-40    22
...      ...      ...

How can I do this (not manually)? And how would I plot it then?

ggplot(data, aes(x=agegrp, y=count, color=offence)) + geom_line()

Solution

  • You don't need to create the table dataset at all if you reshape your original dataset and then plot it:

    library(tidyverse)
    
    testdata %>%
      group_by(agegrp) %>%                # for each age group
      summarise_all(~sum(.=="yes")) %>%   # count "yes" in all columns
      gather(offence,count,-agegrp) %>%   # reshape data
      mutate(agegrp = factor(agegrp, levels = c("10-20","21-40",">40"))) %>%  # specify order of levels (useful for plotting)
      ggplot(aes(x=agegrp, y=count, color=offence, group=offence)) + 
      geom_line()
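
    The same pipeline can also be written with the newer dplyr/tidyr verbs. This is only a sketch, assuming dplyr >= 1.0.0 (for across()) and tidyr >= 1.0.0 (for pivot_longer()); the set.seed() call and the name long are additions for illustration and are not part of the original code.

```r
library(dplyr)
library(tidyr)
library(ggplot2)

set.seed(1)  # assumption: only here so the sample data is reproducible
testdata <- data.frame(
  theft   = sample(c("yes", "no"), size = 100, replace = TRUE),
  assault = sample(c("yes", "no"), size = 100, replace = TRUE),
  robbery = sample(c("yes", "no"), size = 100, replace = TRUE),
  agegrp  = sample(c("10-20", "21-40", ">40"), size = 100, replace = TRUE)
)

long <- testdata %>%
  group_by(agegrp) %>%                                      # for each age group
  summarise(across(everything(), ~ sum(.x == "yes"))) %>%   # count "yes" in every offence column
  pivot_longer(-agegrp, names_to = "offence", values_to = "count") %>%  # one row per agegrp/offence pair
  mutate(agegrp = factor(agegrp, levels = c("10-20", "21-40", ">40")))  # fix the x-axis order

# group = offence is needed because the x axis is discrete;
# without it ggplot cannot tell which points belong to the same line
ggplot(long, aes(x = agegrp, y = count, color = offence, group = offence)) +
  geom_line()
```

    The intermediate long data frame has exactly the offence / agegrp / count layout sketched in the question.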
    

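
  • If you would rather keep the matrix built with rbind() in the question, it can be turned into the long layout without retyping anything. A minimal base R sketch: the names counts and long are illustrative (the question calls the matrix table, which masks base::table()), and set.seed() is only added for reproducibility.

```r
library(ggplot2)

set.seed(1)  # assumption: only here so the sample data is reproducible
testdata <- data.frame(
  theft   = sample(c("yes", "no"), size = 100, replace = TRUE),
  assault = sample(c("yes", "no"), size = 100, replace = TRUE),
  robbery = sample(c("yes", "no"), size = 100, replace = TRUE),
  agegrp  = sample(c("10-20", "21-40", ">40"), size = 100, replace = TRUE)
)

# Same counts as in the question, selecting the "yes" row by name instead of by position
theft   <- table(testdata$theft,   testdata$agegrp)["yes", ]
assault <- table(testdata$assault, testdata$agegrp)["yes", ]
robbery <- table(testdata$robbery, testdata$agegrp)["yes", ]
counts  <- rbind(theft, assault, robbery)  # 3 x 3 matrix: offences in rows, age groups in columns

# as.table() + as.data.frame() unrolls the matrix into one row per cell
long <- as.data.frame(as.table(counts), stringsAsFactors = FALSE)
names(long) <- c("offence", "agegrp", "count")
long$agegrp <- factor(long$agegrp, levels = c("10-20", "21-40", ">40"))  # fix the x-axis order

ggplot(long, aes(x = agegrp, y = count, color = offence, group = offence)) +
  geom_line()
```

    Selecting the row with "yes" rather than [2,] avoids depending on the order in which table() sorts the levels.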