Tags: r, graph, plot, time-series, timeserieschart

R plots: simple statistics on data by year. Base package


How can I apply simple statistics to data and plot them elegantly by year, using only the R base plotting system and default functions? The dataset is quite large, so avoiding the creation of new variables would be preferable.

I hope this is not a silly question, but I have been unable to find a solution that does not involve additional packages such as ggplot2, dplyr, or lubridate, like the ones I found on this site:

ggplot2: Group histogram data by year

R group by year

Split data by year

I am using the default R systems for didactic purposes. I think it could be important training before turning to the more "comfortable" specialized R packages.

Consider a simple dataset:

> prod_dat

lab      year        production(kg)

1        2010        0.3219
1        2011        0.3222
1        2012        0.3305
2        2010        0.3400
2        2011        0.3310
2        2012        0.3310
3        2010        0.3400
3        2011        0.3403
3        2012        0.3410

I would like to plot a histogram of, say, the total production of material during specific years.

> hist(sum(prod_dat$production[prod_dat$year == c(2010, 2012)]))

Unfortunately, this is my best attempt, and it triggers the following warning:

in prod_dat$year == c(2010, 2012):
longer object length is not a multiple of shorter object length 

I am really stuck, so any suggestion would be useful.


Solution

  • Without ggplot2, I used to do it like this, though I think there are smarter ways:

    all <- read.table(header = TRUE, stringsAsFactors = FALSE, text = "lab      year        production
    
                      1        2010        1
                      1        2011        0.3222
                      1        2012        0.3305
                      2        2010        0.3400
                      2        2011        0.3310
                      2        2012        0.3310
                      3        2010        0.3400
                      3        2011        0.3403
                      3        2012        0.3410")
    
    
    
    
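    # Sum production per year; tapply() returns the totals named (and sorted) by year
    totals <- tapply(all$production, all$year, FUN = sum)
    ar <- data.frame(year = names(totals), prod = as.vector(totals))
    # Label each bar with its year
    barplot(ar$prod, names.arg = ar$year)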
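
Since the question asks to avoid creating new variables, a possibly shorter variant is to hand the result of tapply() straight to barplot(), skipping the intermediate data frame. This is only a sketch: it rebuilds the question's data as a hypothetical data frame named prod_dat, assumes the column is called production rather than production(kg), and the axis labels are my own choice.

```r
# Hypothetical reconstruction of the question's data (column renamed to `production`)
prod_dat <- data.frame(
  lab        = rep(1:3, each = 3),
  year       = rep(2010:2012, times = 3),
  production = c(0.3219, 0.3222, 0.3305,
                 0.3400, 0.3310, 0.3310,
                 0.3400, 0.3403, 0.3410)
)

# tapply() returns the per-year sums named by year, so barplot() can label the bars itself
totals <- tapply(prod_dat$production, prod_dat$year, sum)
barplot(totals, xlab = "year", ylab = "total production (kg)")
```

The totals object is kept here only for readability; the tapply() call can be placed directly inside barplot() if no extra variable is wanted.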
    

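
On the warning in the question: == compares element by element and recycles the shorter vector, so a 9-row year column cannot be matched cleanly against a vector of 2 years. To test membership in a set of years, %in% is the usual base R operator. Below is a minimal sketch, assuming a hypothetical data frame prod_dat with a production column. Note also that hist() draws the distribution of a set of values, so a bar chart of per-year totals is probably closer to what was intended.

```r
# Hypothetical reconstruction of the question's data (column renamed to `production`)
prod_dat <- data.frame(
  lab        = rep(1:3, each = 3),
  year       = rep(2010:2012, times = 3),
  production = c(0.3219, 0.3222, 0.3305,
                 0.3400, 0.3310, 0.3310,
                 0.3400, 0.3403, 0.3410)
)

# Keep only the rows whose year is one of the requested years
keep <- prod_dat$year %in% c(2010, 2012)

# Total production for each selected year, drawn as one bar per year
sel_totals <- tapply(prod_dat$production[keep], prod_dat$year[keep], sum)
barplot(sel_totals, xlab = "year", ylab = "total production (kg)")
```

The logical vector keep is the only helper introduced; it can be inlined into the tapply() call if preferred.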