I have this data set:
df <- data.frame(year = seq(1970, 2015, by = 5),
                 staff = c(219, 231, 259, 352, 448, 427, 556, 555, 602, 622),
                 applications = c(5820, 7107, 6135, 16119, 19381, 36611, 54962, 45759, 40358, 458582))
I want to do some exploratory analysis and check whether staff strength is growing in line with the number of applications received. I plotted a line graph in Excel:
which isn't very meaningful. I've also taken the log of both variables, which almost gave the desired result, but I wonder whether log-scale graphs are harder to explain to non-mathematicians, since I want to use these kinds of graphs in a presentation to managerial staff who don't know much statistics or mathematics. My question is how to draw a meaningful graph in this situation. I have a gut feeling that R might offer a better solution than Excel (which is why I'm asking here), but the problem is how.
Any help will be highly appreciated.
One recommendation would be to change your measure into some type of ratio metric, for example staff per applications. In the following, I will use staff per 1,000 applications:
library(ggplot2)
df <- data.frame(year = seq(1970, 2015, by = 5),
                 staff = c(219, 231, 259, 352, 448, 427, 556, 555, 602, 622),
                 applications = c(5820, 7107, 6135, 16119, 19381, 36611, 54962, 45759, 40358, 458582))
ggplot(data = df, aes(x = year, y = staff / (applications / 1000))) +
  geom_point(size = 3) +
  geom_line() +
  ggtitle("Staff per 1,000 Applications")
We can achieve the same result without ggplot2, using base graphics:
with(df, {
  # Draw the line first, then overlay the points as a separate call
  plot(x = year, y = staff / (applications / 1000), type = "l", main = "Staff per 1,000 Applications")
  points(x = year, y = staff / (applications / 1000), pch = 21, cex = 2, bg = "black")
})
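If you want to check the numbers behind the plot, or show them in a small table next to it, you could store the ratio as its own column first. This is only a sketch; the column name staff_per_1000 is my own choice and not something from your data:

```r
df <- data.frame(year = seq(1970, 2015, by = 5),
                 staff = c(219, 231, 259, 352, 448, 427, 556, 555, 602, 622),
                 applications = c(5820, 7107, 6135, 16119, 19381, 36611, 54962, 45759, 40358, 458582))

# Staff per 1,000 applications, stored as a column (the name is illustrative)
df$staff_per_1000 <- df$staff / (df$applications / 1000)

# Rounded values, one row per year, for a quick sanity check
print(data.frame(year = df$year, staff_per_1000 = round(df$staff_per_1000, 1)))
```

With the column in place, both plots above can use y = staff_per_1000 instead of repeating the expression.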
Alternatively, you could make your dataset a little more tidy (see this, this, and/or this for more information) and plot the two measures in separate facets with free_y scales:
library(tidyr)
df_tidy <- gather(df, measure, value, -year)
ggplot(data = df_tidy, aes(x = year, y = value)) +
  geom_point(size = 3) +
  geom_line() +
  facet_grid(measure ~ ., scales = "free_y")
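As a side note, gather() has been superseded by pivot_longer() in tidyr 1.0.0 and later. Assuming you have a recent tidyr installed, the same reshaping and faceted plot could be sketched like this (the object name p is just for illustration):

```r
library(tidyr)
library(ggplot2)

df <- data.frame(year = seq(1970, 2015, by = 5),
                 staff = c(219, 231, 259, 352, 448, 427, 556, 555, 602, 622),
                 applications = c(5820, 7107, 6135, 16119, 19381, 36611, 54962, 45759, 40358, 458582))

# Reshape to long format: one row per year and measure
df_tidy <- pivot_longer(df, cols = -year, names_to = "measure", values_to = "value")

# One facet per measure, each with its own y-axis range
p <- ggplot(data = df_tidy, aes(x = year, y = value)) +
  geom_point(size = 3) +
  geom_line() +
  facet_grid(measure ~ ., scales = "free_y")
p
```

The rows come out in a different order than with gather(), but the plot is the same.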