r ggplot2 plot histogram overlay

Is there a way to make a histogram in R that displays both how often an event occurred each year and a specific detail about each event?


I am trying to create a histogram that displays events that occurred in specific years as either fatal or non-fatal. I can create a histogram of the yearly occurrences of these events in base R by using:

Confirmed_Unprovoked_Attacks_ALL <- read.csv("/path to csv/.csv")

hist(Confirmed_Unprovoked_Attacks_ALL$year, main = 'Confirmed Unprovoked Attacks', ylab = 'Number of Attacks', xlab = 'Year', breaks = 100, xlim = c(1850,2050))

However, if possible, I would like to show on the same (or a similar) histogram whether the events in each year were fatal or non-fatal, by coloring them accordingly. I have coded the two outcomes numerically, with "1" representing fatal and "2" representing non-fatal.

Here is what the data in the .csv file looks like:

dput(Confirmed_Unprovoked_Attacks_ALL)
structure(list(year = c(1876L, 1890L, 1890L, 1894L, 1907L, 1907L, 
1909L, 1916L, 1916L, 1916L, 1922L, 1922L, 1926L, 1930L, 1934L, 
1935L, 1935L, 1936L, 1936L, 1936L, 1937L, 1937L, 1950L, 1950L, 
1951L, 1951L, 1952L, 1953L, 1954L, 1955L, 1955L, 1956L, 1956L, 
1956L, 1956L, 1957L, 1957L, 1959L, 1959L, 1960L, 1960L, 1960L, 
1960L, 1960L, 1961L, 1961L, 1962L, 1962L, 1963L, 1964L, 1964L, 
1964L, 1965L, 1966L, 1966L, 1966L, 1967L, 1968L, 1969L, 1969L, 
1969L, 1971L, 1971L, 1971L, 1971L, 1971L, 1971L, 1971L, 1971L, 
1972L, 1972L, 1973L, 1974L, 1974L, 1974L, 1974L, 1974L, 1974L, 
1975L, 1975L, 1975L, 1975L, 1976L, 1976L, 1976L, 1976L, 1976L, 
1977L, 1978L, 1979L, 1979L, 1980L, 1980L, 1980L, 1981L, 1981L, 
1981L, 1982L, 1982L, 1982L, 1982L, 1982L, 1982L, 1982L, 1982L, 
1983L, 1983L, 1984L, 1984L, 1984L, 1984L, 1984L, 1985L, 1985L, 
1985L, 1985L, 1985L, 1986L, 1986L, 1986L, 1987L, 1987L, 1987L, 
1988L, 1988L, 1988L, 1988L, 1989L, 1989L, 1989L, 1989L, 1989L, 
1989L, 1989L, 1989L, 1989L, 1990L, 1990L, 1990L, 1990L, 1990L, 
1990L, 1990L, 1991L, 1991L, 1991L, 1991L, 1991L, 1991L, 1992L, 
1992L, 1992L, 1992L, 1992L, 1992L, 1992L, 1993L, 1993L, 1993L, 
1993L, 1993L, 1993L, 1994L, 1994L, 1994L, 1994L, 1994L, 1994L, 
1994L, 1994L, 1995L, 1995L, 1995L, 1995L, 1995L, 1995L, 1996L, 
1996L, 1996L, 1996L, 1996L, 1996L, 1996L, 1997L, 1997L, 1997L, 
1997L, 1998L, 1998L, 1998L, 1998L, 1998L, 1998L, 1998L, 1999L, 
1999L, 1999L, 1999L, 2000L, 2000L, 2000L, 2000L, 2000L, 2000L, 
2000L, 2000L, 2000L, 2000L, 2000L, 2000L, 2001L, 2001L, 2001L, 
2001L, 2001L, 2002L, 2002L, 2002L, 2002L, 2003L, 2003L, 2003L, 
2004L, 2004L, 2004L, 2004L, 2004L, 2004L, 2004L, 2004L, 2004L, 
2004L, 2005L, 2005L, 2005L, 2005L, 2005L, 2005L, 2005L, 2005L, 
2005L, 2005L, 2005L, 2005L, 2005L, 2006L, 2006L, 2006L, 2006L, 
2006L, 2006L, 2006L, 2007L, 2007L, 2007L, 2007L, 2007L, 2007L, 
2007L, 2007L, 2007L, 2007L, 2008L, 2008L, 2008L, 2008L, 2008L, 
2009L, 2009L, 2009L, 2009L, 2009L, 2009L, 2010L, 2010L, 2010L, 
2010L, 2010L, 2010L, 2010L, 2010L, 2010L, 2010L, 2011L, 2011L, 
2011L, 2011L, 2011L, 2011L, 2011L, 2011L, 2011L, 2011L, 2011L, 
2012L, 2012L, 2012L, 2012L, 2012L, 2012L, 2013L, 2013L, 2013L, 
2014L, 2014L, 2014L, 2014L, 2014L, 2014L, 2014L, 2014L, 2014L, 
2015L, 2015L, 2015L, 2015L, 2015L, 2015L, 2015L, 2015L, 2015L, 
2015L, 2015L, 2015L, 2015L, 2015L, 2016L, 2016L, 2016L, 2016L, 
2016L, 2016L, 2017L, 2017L, 2018L, 2018L, 2018L, 2018L, 2018L, 
2018L, 2018L, 2018L, 2018L, 2018L, 2018L, 2018L, 2019L, 2019L, 
2019L, 2019L, 2019L, 2019L, 2019L, 2019L, 2019L, 2019L, 2020L, 
2020L, 2020L, 2020L, 2020L, 2020L, 2020L, 2020L, 2020L, 2020L, 
2020L, 2020L, 2020L, 2020L, 2020L, 2020L, 2020L, 2021L, 2021L, 
2021L, 2021L, 2021L, 2021L, 2021L, 2021L, 2021L, 2021L, 2021L, 
2021L, 2021L, 2022L, 2022L, 2022L, 2022L, 2022L, 2022L, 2022L, 
2022L, 2023L, 2023L, 2023L, 2023L, 2023L, 2023L, 2023L, NA), 
    outcome = c(1L, 1L, 1L, 2L, 1L, 1L, 1L, 1L, 1L, 2L, 1L, 1L, 
    1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 1L, 2L, 1L, 
    2L, 2L, 1L, 2L, 1L, 1L, 2L, 2L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 
    2L, 2L, 1L, 2L, 2L, 2L, 2L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 1L, 
    1L, 1L, 1L, 2L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 1L, 2L, 2L, 1L, 
    1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 
    2L, 2L, 2L, 2L, 2L, 2L, 2L, 1L, 2L, 2L, 1L, 1L, 1L, 2L, 2L, 
    2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 
    2L, 2L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 
    2L, 2L, 2L, 2L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 1L, 2L, 2L, 2L, 
    2L, 2L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 1L, 1L, 2L, 2L, 2L, 2L, 
    1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 
    2L, 2L, 2L, 2L, 2L, 1L, 1L, 2L, 2L, 2L, 1L, 1L, 2L, 2L, 2L, 
    2L, 2L, 1L, 2L, 2L, 2L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 
    2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 1L, 2L, 2L, 2L, 1L, 1L, 2L, 
    1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 1L, 2L, 2L, 2L, 2L, 
    2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 
    2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 1L, 2L, 2L, 2L, 2L, 
    2L, 2L, 2L, 2L, 2L, 2L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 
    2L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 1L, 1L, 1L, 
    2L, 2L, 2L, 1L, 2L, 2L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 
    1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 1L, 
    1L, 2L, 2L, 2L, 2L, 2L, 2L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 
    2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 1L, 
    1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 
    2L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 1L, 
    1L, 2L, 2L, 2L, 2L, 2L, 2L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L
    )), class = "data.frame", row.names = c(NA, -402L))

I haven't tried anything other than base R and ggplot2.


Solution

  • If base R isn't a requirement, you can do this with ggplot2: convert outcome to a labelled factor, map it to fill in the aesthetic definitions, and use position="stack" in the histogram geometry:

    output

    library(dplyr)
    library(ggplot2)
    Confirmed_Unprovoked_Attacks_ALL %>% 
      mutate(outcome = factor(outcome, levels=1:2, labels=c("Fatal", "Non-fatal"))) %>% 
      ggplot(aes(x=year, fill=outcome)) + 
      geom_histogram(position="stack", bins = 100) + 
      scale_x_continuous(limits=c(1850, 2050)) + 
      theme_classic() + 
      theme(legend.position="top") + 
      labs(x="Year", y="Count", fill="")
    #> Warning: Removed 1 rows containing non-finite values (`stat_bin()`).
    #> Warning: Removed 4 rows containing missing values (`geom_bar()`).
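
The two warnings above have different causes. The first is the single NA in year, which stat_bin() drops. The second most likely comes from scale_x_continuous(limits = ...): setting limits on the scale discards bars whose edges fall outside them, whereas coord_cartesian() only zooms the view. Here is a minimal sketch of a variant that should avoid both. It assumes a small made-up data frame (attacks is hypothetical and stands in for Confirmed_Unprovoked_Attacks_ALL) and uses binwidth = 1 so that each bar covers exactly one year:

```r
library(ggplot2)

# Hypothetical toy data in the same shape as the question's data frame
attacks <- data.frame(
  year    = c(1990L, 1990L, 1991L, 1991L, 1991L, 1993L, NA),
  outcome = c(1L, 2L, 2L, 2L, 1L, 2L, 2L)
)

# Drop rows with a missing year up front, so stat_bin() has nothing to remove
attacks_clean <- attacks[!is.na(attacks$year), ]

# Recode the numeric outcome (1 = fatal, 2 = non-fatal) as a labelled factor
attacks_clean$outcome <- factor(attacks_clean$outcome, levels = 1:2,
                                labels = c("Fatal", "Non-fatal"))

p <- ggplot(attacks_clean, aes(x = year, fill = outcome)) +
  geom_histogram(position = "stack", binwidth = 1) +  # one bin per year
  coord_cartesian(xlim = c(1850, 2050)) +             # zoom without discarding bins
  theme_classic() +
  theme(legend.position = "top") +
  labs(x = "Year", y = "Count", fill = "")

print(p)
```

With the real data, replace attacks with Confirmed_Unprovoked_Attacks_ALL. One-year bins also make the bar heights directly readable as attacks per year, whereas bins = 100 over the 1850 to 2050 range gives bins roughly two years wide.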
    

    input

    Confirmed_Unprovoked_Attacks_ALL <- structure(list(year = c(1876L, 1890L, 1890L, 1894L, 1907L, 1907L, 
    1909L, 1916L, 1916L, 1916L, 1922L, 1922L, 1926L, 1930L, 1934L, 
    1935L, 1935L, 1936L, 1936L, 1936L, 1937L, 1937L, 1950L, 1950L, 
    1951L, 1951L, 1952L, 1953L, 1954L, 1955L, 1955L, 1956L, 1956L, 
    1956L, 1956L, 1957L, 1957L, 1959L, 1959L, 1960L, 1960L, 1960L, 
    1960L, 1960L, 1961L, 1961L, 1962L, 1962L, 1963L, 1964L, 1964L, 
    1964L, 1965L, 1966L, 1966L, 1966L, 1967L, 1968L, 1969L, 1969L, 
    1969L, 1971L, 1971L, 1971L, 1971L, 1971L, 1971L, 1971L, 1971L, 
    1972L, 1972L, 1973L, 1974L, 1974L, 1974L, 1974L, 1974L, 1974L, 
    1975L, 1975L, 1975L, 1975L, 1976L, 1976L, 1976L, 1976L, 1976L, 
    1977L, 1978L, 1979L, 1979L, 1980L, 1980L, 1980L, 1981L, 1981L, 
    1981L, 1982L, 1982L, 1982L, 1982L, 1982L, 1982L, 1982L, 1982L, 
    1983L, 1983L, 1984L, 1984L, 1984L, 1984L, 1984L, 1985L, 1985L, 
    1985L, 1985L, 1985L, 1986L, 1986L, 1986L, 1987L, 1987L, 1987L, 
    1988L, 1988L, 1988L, 1988L, 1989L, 1989L, 1989L, 1989L, 1989L, 
    1989L, 1989L, 1989L, 1989L, 1990L, 1990L, 1990L, 1990L, 1990L, 
    1990L, 1990L, 1991L, 1991L, 1991L, 1991L, 1991L, 1991L, 1992L, 
    1992L, 1992L, 1992L, 1992L, 1992L, 1992L, 1993L, 1993L, 1993L, 
    1993L, 1993L, 1993L, 1994L, 1994L, 1994L, 1994L, 1994L, 1994L, 
    1994L, 1994L, 1995L, 1995L, 1995L, 1995L, 1995L, 1995L, 1996L, 
    1996L, 1996L, 1996L, 1996L, 1996L, 1996L, 1997L, 1997L, 1997L, 
    1997L, 1998L, 1998L, 1998L, 1998L, 1998L, 1998L, 1998L, 1999L, 
    1999L, 1999L, 1999L, 2000L, 2000L, 2000L, 2000L, 2000L, 2000L, 
    2000L, 2000L, 2000L, 2000L, 2000L, 2000L, 2001L, 2001L, 2001L, 
    2001L, 2001L, 2002L, 2002L, 2002L, 2002L, 2003L, 2003L, 2003L, 
    2004L, 2004L, 2004L, 2004L, 2004L, 2004L, 2004L, 2004L, 2004L, 
    2004L, 2005L, 2005L, 2005L, 2005L, 2005L, 2005L, 2005L, 2005L, 
    2005L, 2005L, 2005L, 2005L, 2005L, 2006L, 2006L, 2006L, 2006L, 
    2006L, 2006L, 2006L, 2007L, 2007L, 2007L, 2007L, 2007L, 2007L, 
    2007L, 2007L, 2007L, 2007L, 2008L, 2008L, 2008L, 2008L, 2008L, 
    2009L, 2009L, 2009L, 2009L, 2009L, 2009L, 2010L, 2010L, 2010L, 
    2010L, 2010L, 2010L, 2010L, 2010L, 2010L, 2010L, 2011L, 2011L, 
    2011L, 2011L, 2011L, 2011L, 2011L, 2011L, 2011L, 2011L, 2011L, 
    2012L, 2012L, 2012L, 2012L, 2012L, 2012L, 2013L, 2013L, 2013L, 
    2014L, 2014L, 2014L, 2014L, 2014L, 2014L, 2014L, 2014L, 2014L, 
    2015L, 2015L, 2015L, 2015L, 2015L, 2015L, 2015L, 2015L, 2015L, 
    2015L, 2015L, 2015L, 2015L, 2015L, 2016L, 2016L, 2016L, 2016L, 
    2016L, 2016L, 2017L, 2017L, 2018L, 2018L, 2018L, 2018L, 2018L, 
    2018L, 2018L, 2018L, 2018L, 2018L, 2018L, 2018L, 2019L, 2019L, 
    2019L, 2019L, 2019L, 2019L, 2019L, 2019L, 2019L, 2019L, 2020L, 
    2020L, 2020L, 2020L, 2020L, 2020L, 2020L, 2020L, 2020L, 2020L, 
    2020L, 2020L, 2020L, 2020L, 2020L, 2020L, 2020L, 2021L, 2021L, 
    2021L, 2021L, 2021L, 2021L, 2021L, 2021L, 2021L, 2021L, 2021L, 
    2021L, 2021L, 2022L, 2022L, 2022L, 2022L, 2022L, 2022L, 2022L, 
    2022L, 2023L, 2023L, 2023L, 2023L, 2023L, 2023L, 2023L, NA), 
        outcome = c(1L, 1L, 1L, 2L, 1L, 1L, 1L, 1L, 1L, 2L, 1L, 1L, 
        1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 1L, 2L, 1L, 
        2L, 2L, 1L, 2L, 1L, 1L, 2L, 2L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 
        2L, 2L, 1L, 2L, 2L, 2L, 2L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 1L, 
        1L, 1L, 1L, 2L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 1L, 2L, 2L, 1L, 
        1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 
        2L, 2L, 2L, 2L, 2L, 2L, 2L, 1L, 2L, 2L, 1L, 1L, 1L, 2L, 2L, 
        2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 
        2L, 2L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 
        2L, 2L, 2L, 2L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 1L, 2L, 2L, 2L, 
        2L, 2L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 1L, 1L, 2L, 2L, 2L, 2L, 
        1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 
        2L, 2L, 2L, 2L, 2L, 1L, 1L, 2L, 2L, 2L, 1L, 1L, 2L, 2L, 2L, 
        2L, 2L, 1L, 2L, 2L, 2L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 
        2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 1L, 2L, 2L, 2L, 1L, 1L, 2L, 
        1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 1L, 2L, 2L, 2L, 2L, 
        2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 
        2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 1L, 2L, 2L, 2L, 2L, 
        2L, 2L, 2L, 2L, 2L, 2L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 
        2L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 1L, 1L, 1L, 
        2L, 2L, 2L, 1L, 2L, 2L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 
        1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 1L, 
        1L, 2L, 2L, 2L, 2L, 2L, 2L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 
        2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 1L, 
        1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 
        2L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 1L, 
        1L, 2L, 2L, 2L, 2L, 2L, 2L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L
        )), class = "data.frame", row.names = c(NA, -402L))
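
If base R is a requirement after all, a stacked bar chart of per-year counts gives a similar picture. This is only a sketch under assumptions: attacks is a small made-up data frame standing in for Confirmed_Unprovoked_Attacks_ALL, the colors are arbitrary, and it uses table() with barplot() rather than hist(), since hist() has no built-in way to stack groups:

```r
# Hypothetical toy data in the same shape as the question's data frame
attacks <- data.frame(
  year    = c(1990L, 1990L, 1991L, 1991L, 1991L, 1993L, NA),
  outcome = c(1L, 2L, 2L, 2L, 1L, 2L, 2L)
)

# Recode the numeric outcome (1 = fatal, 2 = non-fatal) as a labelled factor
attacks$outcome <- factor(attacks$outcome, levels = 1:2,
                          labels = c("Fatal", "Non-fatal"))

# Use every year in the observed range as a factor level, so years with
# no events still get an (empty) slot on the x axis
all_years <- seq(min(attacks$year, na.rm = TRUE),
                 max(attacks$year, na.rm = TRUE))

# Rows = outcome, columns = year; table() drops the NA year by default
counts <- table(attacks$outcome, factor(attacks$year, levels = all_years))

# A matrix passed to barplot() is drawn as stacked bars, one per column
barplot(counts,
        col = c("firebrick", "steelblue"), border = NA, space = 0,
        main = "Confirmed Unprovoked Attacks",
        xlab = "Year", ylab = "Number of Attacks",
        legend.text = rownames(counts),
        args.legend = list(x = "topleft", bty = "n"))
```

Each bar here is exactly one year, so the total bar height is the number of attacks that year and the colored segments split it into fatal and non-fatal.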