Tags: r, dplyr, summarize

How can I create a column or summary table that lists the years of data available for each value?


I have data that looks like this (it's a sample; the real data has hundreds of rows spanning ~15 years). I would like to create a column (or summary table) that displays the years of available data for each location.

Year, Place
2000, 'Adak'
2000, 'Kodiak'
2000, 'Saltsdale'
2001, 'Adak'
2001, 'Saltsdale'
2001, 'Tawney'
2002, 'Adak'
2002, 'Kodiak'
2002, 'Tawney'

I would really like it to look like this if possible:

Place, Years_Available 
'Adak', 2000/2002
'Kodiak', 2000/2002
'Saltsdale', 2000/2001
'Tawney', 2001/2002

I tried summarize_all, but it's giving me a weird output where the Years_Available column just repeats the first year 15 times.

b1 <- b %>% 
    group_by(Place) %>%
    mutate(years = toString(Year)) %>%
    group_by(Place,years) %>%
    summarize_all(funs(sum(!is.na(.))))

Solution

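  • We can use range() to get the first and last Year within each Place, and toString() to collapse the pair into a single string (comma-separated, e.g. "2000, 2002"):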

    df1 %>%
       group_by(Place) %>% 
       summarise(Year = toString(range(Year)))
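
To match the "2000/2002" format asked for in the question, one option is to swap toString() for paste() with collapse = "/". The following is a minimal sketch, assuming the data frame is called df1 as in the answer and contains only the sample rows from the question; the names years_by_place and df1_with_years are made up for illustration.

```r
library(dplyr)

# Sample rows copied from the question (df1 is the name used in the answer)
df1 <- data.frame(
  Year  = c(2000, 2000, 2000, 2001, 2001, 2001, 2002, 2002, 2002),
  Place = c("Adak", "Kodiak", "Saltsdale", "Adak", "Saltsdale", "Tawney",
            "Adak", "Kodiak", "Tawney")
)

# Summary table: one row per Place, first and last year joined with "/"
# (years_by_place is a hypothetical name)
years_by_place <- df1 %>%
  group_by(Place) %>%
  summarise(Years_Available = paste(range(Year), collapse = "/"))

# Same string added as a column on the original rows instead
# (df1_with_years is a hypothetical name)
df1_with_years <- df1 %>%
  group_by(Place) %>%
  mutate(Years_Available = paste(range(Year), collapse = "/")) %>%
  ungroup()
```

Note that range() only reports the first and last year, so a gap (Kodiak has no 2001 row) does not show up. If every available year should be listed, toString(sort(unique(Year))) in place of the paste() call would probably be closer to what is wanted.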