Right class for yearly / bi-annual data

I was wondering which class or column specification is best suited to yearly or bi-annual data.

So if I have a time variable in a data frame that looks like the following and I don't want it converted to a date (%Y/%m/%d), what is the best way to specify it? Should I stick with the class R proposes (here double; in my case numeric) or should I transform it to another class?

# A tibble: 4 x 1
1  1997
2  1998
3  1999
4  2000

For bi-annual data the question is even more relevant to me, because working with a character column is, I guess, often inconvenient.

# A tibble: 4 x 2
  b_time_variant1 b_time_variant2
  <chr>           <chr>          
1 1997:Jun        1997:6         
2 1997:Dec        1997:12        
3 1998:Jun        1998:6         
4 1998:Dec        1998:12

What do you recommend? Should I change the variable / class specification or proceed with the classes above?

Any comment or advice is appreciated!


PS: I have looked into the zoo package; it offers a yearqtr class, so maybe something similar exists for bi-annual data.


  • Doubles or integers seem suitable to represent yearly data.
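
    As a minimal sketch of that point (base R only; the variable names are made up for illustration), a yearly index read in as double can be stored as integer and still supports the usual arithmetic:

    ```r
    # Yearly time index kept as plain numbers (no Date conversion)
    year <- c(1997, 1998, 1999, 2000)   # typically read in as double
    year_int <- as.integer(year)        # integer is enough for whole years

    diff(year_int)                      # gaps between successive observations
    year_int + 1L                       # shift every observation by one year
    seq(min(year_int), max(year_int))   # full range, handy for spotting missing years
    ```

    Either type sorts, joins and plots as expected; integer mainly rules out accidental fractional years.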

    For bi-annual data, yearmon (from the zoo package) could be used, in which case the internal representation is year + fraction, where the fraction is 5/12 (June) or 11/12 (December). There is then a difference of 1 between the same half in successive years and 0.5 between successive half years.

    library(zoo)

    x <- c("1997:Jun", "1997:Dec", "1998:Jun", "1998:Dec")
    x2 <- as.yearmon(x, "%Y:%b"); x2  # %b matches abbreviated month names in the current locale
    ## [1] "Jun 1997" "Dec 1997" "Jun 1998" "Dec 1998"
    x2 + 0.5  # next half year
    ## [1] "Dec 1997" "Jun 1998" "Dec 1998" "Jun 1999"
    diff(x2)  # differences between successive entries
    ## [1] 0.5 0.5 0.5

    Another possibility is to label each half year by its first month rather than its last month. Internally, the first half is then the year itself and the second half is the year plus 1/2.

    x3 <- x2 - 5/12; x3
    ## [1] "Jan 1997" "Jul 1997" "Jan 1998" "Jul 1998"

    With yearqtr the internal form is year + fraction, where the fraction is 1/4 (Q2) or 3/4 (Q4), so there is still 0.5 between successive half years. Subtracting 1/4 again moves each label to the first quarter of its half.

    x4 <- as.yearqtr(x, "%Y:%b"); x4
    ## [1] "1997 Q2" "1997 Q4" "1998 Q2" "1998 Q4"
    x4 + 0.5  # next half year
    ## [1] "1997 Q4" "1998 Q2" "1998 Q4" "1999 Q2"
    diff(x4)  # differences between successive entries
    ## [1] 0.5 0.5 0.5
    x5 <- x4 - 1/4; x5
    ## [1] "1997 Q1" "1997 Q3" "1998 Q1" "1998 Q3"

    We could also define a subclass of yearqtr, say yearhalf. Here a yearhalf object is represented internally as year + 0 or year + 1/2. Only a few methods are defined below, so after transforming a yearhalf object it will likely be necessary to convert the result back to yearhalf, which may be sufficient. It would be possible to define all the yearqtr methods for yearhalf objects.

    as.yearhalf <- function(x, ...) {
      x <- as.yearqtr(x, ...)
      # keep the year and round the quarter fraction down to 0 or 1/2
      x <- as.integer(x) + floor(2 * (x %% 1)) / 2
      structure(x, class = c("yearhalf", "yearqtr"))
    }

    as.yearqtr.yearhalf <- function(x, ...) {
      structure(x, ..., class = "yearqtr")
    }

    # relabel quarters as halves: Q1/Q2 -> H1, Q3/Q4 -> H2
    format.yearhalf <- function(x, ...) {
      sub("Q[34]", "H2", sub("Q[12]", "H1", format(as.yearqtr(x), ...)))
    }
    yh <- as.yearhalf(x2); yh
    ## [1] "1997 H1" "1997 H2" "1998 H1" "1998 H2"
    as.yearhalf(yh + .5)
    ## [1] "1997 H2" "1998 H1" "1998 H2" "1999 H1"
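
    If pulling in zoo is not wanted, the same year + 0 / year + 1/2 encoding can be sketched in base R for the "1997:6" style strings from the question. This is only an illustration: the helper name half_year_index is hypothetical, and it assumes months 1 to 6 belong to the first half and 7 to 12 to the second.

    ```r
    # Hypothetical helper (not part of zoo): encode "YYYY:M" strings as
    # year + 0 (first half) or year + 0.5 (second half).
    half_year_index <- function(x) {
      parts <- strsplit(x, ":", fixed = TRUE)
      year  <- as.integer(vapply(parts, `[`, character(1), 1))
      month <- as.integer(vapply(parts, `[`, character(1), 2))
      # Assumption: months 1-6 are the first half, 7-12 the second half
      year + ifelse(month <= 6, 0, 0.5)
    }

    b <- c("1997:6", "1997:12", "1998:6", "1998:12")
    h <- half_year_index(b)
    h        # plain doubles that sort and difference like the yearhalf values
    diff(h)  # spacing between successive half years
    ```

    The result is an ordinary numeric column, so it loses the nice "1997 H1" printing but needs no extra class or package.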