I'm trying to use geom_col to chart columns for values in time series (annual and quarterly).
When I use the zoo package's yearqtr class for the x-axis values and round the y-axis values to whole numbers, geom_col no longer appears to use the default position = 'identity' to set the bar heights from the y-value of each observation. Instead it appears to switch to position = 'count', treating the rounded y-values as factors and counting the number of occurrences of each value (e.g., 3 occurrences have a rounded y-value = 11).
If I switch to geom_line, the graph is fine with quarterly x-axis values and rounded y-axis values.
library(zoo)
library(ggplot2)
Annual.Periods <- seq(to = 2020, by = 1, length.out = 8) # 8 years
Quarter.Periods <- as.yearqtr(seq(to = 2020, by = 0.25, length.out = 8)) # 8 Quarters
Values <- seq(to = 11, by = 0.25, length.out = 8)
Data.Annual.Real <- data.frame(X = Annual.Periods, Y = round(Values, 1))
Data.Annual.Whole <- data.frame(X = Annual.Periods, Y = round(Values, 0))
Data.Quarter.Real <- data.frame(X = Quarter.Periods, Y = round(Values, 1))
Data.Quarter.Whole <- data.frame(X = Quarter.Periods, Y = round(Values, 0))
ggplot(data = Data.Annual.Real, aes(X, Y)) + geom_col()
ggplot(data = Data.Annual.Whole, aes(X, Y)) + geom_col()
ggplot(data = Data.Quarter.Real, aes(X, Y)) + geom_col()
ggplot(data = Data.Quarter.Whole, aes(X, Y)) + geom_col() # appears to treat y-values as factors and uses position = 'count' to count occurrences (e.g., 3 occurrences have a rounded Value = 11)
ggplot(data = Data.Quarter.Whole, aes(X, Y)) + geom_line()
rstudioapi::versionInfo()
# $mode
# [1] "desktop"
#
# $version
# [1] ‘1.3.959’
#
# $release_name
# [1] "Middlemist Red"
sessionInfo()
# R version 4.0.0 (2020-04-24)
# Platform: x86_64-apple-darwin17.0 (64-bit)
# Running under: macOS Mojave 10.14.6
#
# Matrix products: default
# BLAS: /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib
# LAPACK: /Library/Frameworks/R.framework/Versions/4.0/Resources/lib/libRlapack.dylib
#
# locale:
# [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
#
# attached base packages:
# [1] stats graphics grDevices utils datasets methods base
#
# other attached packages:
# [1] ggplot2_3.3.1 zoo_1.8-8
ggplot tries to guess the orientation of geom_col(), that is, which variable serves as the base of the bars and which holds the values to represent. Apparently, when your Y variable contains no decimal numbers, it chooses Y as the base (Y stays numeric, though; there is no conversion to factor) and sums up your quarters.
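A rough way to see why the guess tips toward Y for the quarterly data: the sketch below assumes that the orientation heuristic in this ggplot2 version asks which variable "looks like" whole numbers. That is my reading of the behaviour, not documented API, and looks_whole() is a hypothetical helper, not a ggplot2 function. It uses plain numeric stand-ins for the question's data, since a yearqtr value is stored as year plus a fraction (e.g. 2019.25).

```r
# Plain-numeric stand-ins for the data in the question (base R only).
x_years    <- seq(to = 2020, by = 1,    length.out = 8)  # annual X
x_quarters <- seq(to = 2020, by = 0.25, length.out = 8)  # numeric form of the yearqtr X
values     <- seq(to = 11,   by = 0.25, length.out = 8)
y_real     <- round(values, 1)
y_whole    <- round(values, 0)

# Hypothetical helper (an assumption, not part of ggplot2):
# TRUE when every element is effectively a whole number.
looks_whole <- function(v) isTRUE(all.equal(v, round(v)))

looks_whole(x_years)     # TRUE
looks_whole(x_quarters)  # FALSE
looks_whole(y_real)      # FALSE
looks_whole(y_whole)     # TRUE
```

Only the quarterly X with whole-number Y combination has a whole-looking Y next to a fractional X, which matches the one plot in the question that comes out flipped. In the annual case both variables look whole, so there is nothing to tip the guess away from X.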
For cases like this you can tell geom_col() which variable to use as the base of the bars via the orientation argument:
ggplot(data = Data.Quarter.Whole, aes(X, Y)) + geom_col(orientation = "x")
EDIT: I have just seen that Roman answered it in the comments.