r plot ggplot2 time-series stacked-area-chart

unexpected numeric constant in: "ggplot(


I am trying to plot trends in the age of university applicants. I combined data from several databases into the following data frame:

> AgeGroup <- c("Year", "17","18","19","20", "21", "22", "23", "24", "25to29", "30to39", "40plus"); AgeGroup
 [1] "Year"   "17"     "18"     "19"     "20"     "21"     "22"     "23"     "24"    
[10] "25to29" "30to39" "40plus"

> AGEgroups <- as.data.frame(cbind(a,h,i,j, k, l, m, n, o, p, q, r)); AGEgroups
  a    h      i      j     k     l     m     n    o     p     q     r
1  2004 1053 160450  74600 25778 14317  9761  6995 5589 15902 17171  8351
2  2005 1115 175406  77751 28368 15191 10551  7778 6107 18153 18695  9686
...
9  2012  743 199213  93669 37214 21240 14651 10962 8781 26387 27246 15308
10 2013  702 201821 103356 39185 21557 15242 11226 8707 27326 26887 15442

> colnames(AGEgroups) <- AgeGroup
> AGEgroups

   Year   17     18     19    20    21    22    23   24 25to29 30to39 40plus
1  2004 1053 160450  74600 25778 14317  9761  6995 5589  15902  17171   8351
...

10 2013  702 201821 103356 39185 21557 15242 11226 8707  27326  26887  15442

Then I plot the graph using the ggplot2 library:

> ggplot(AGEgroups,aes(x=Year, y=NumerOfApplicants, fill=Age.Range)) +
+   geom_area(data = AGEgroups, aes(x=Year, y=h, fill="17 yrs"))+
+   geom_area(data = AGEgroups, aes(x=Year, y=i, fill="18 yrs"))+
+   geom_area(data = AGEgroups, aes(x=Year, y=j, fill="19 yrs"))+

...

I get a graph that generally looks OK (although my attempt to customise the colours failed, and I cannot post the image because I do not have enough reputation points), but only 5 of the 11 age groups are plotted.

When I try to plot them separately using:

ggplot(AGEgroups,aes(x=Year, y=NumerOfApplicants, fill=Age.Range)) +
  geom_area(data = AGEgroups, aes(x=Year, y=l, fill="21 yrs"))

the majority work fine, but then when I plot:

ggplot(AGEgroups,aes(x=Year, y=NumerOfApplicants, fill=Age.Range)) +
  geom_area(data = AGEgroups, aes(x=Year, y=m, fill="22 yrs"))

which is the missing group, I get the error message:

Error: unexpected numeric constant in:
"ggplot(AGEgroups,aes(x=Year, y=NumerOfApplicants, fill=Age.Range)) +
  geom_area(data = AGEgroups, aes(x=Year, y=m, fill="22"

I have compared both lines of code and can see no difference in the syntax. The 'm' vector prints correctly when I call it. Any ideas why this might be happening?

After restarting the computer today, I no longer get the unexpected numeric constant error, which means the old "switch on/off" technique solves at least 50% of problems ;)

Still, the graph displays 5 instead of 11 variables. The suggested dput(head(AGEgroups)) yields the following output:

structure(list(Year = 2004:2009, `17` = c(1053L, 1115L, 937L,
1023L, 1273L, 1236L), `18` = c(160450L, 175406L, 173806L, 176306L, 
187802L, 197090L), `19` = c(74600L, 77751L, 71285L, 83706L, 89462L, 
97544L), `20` = c(25778L, 28368L, 27003L, 29955L, 36255L, 38451L
), `21` = c(14317L, 15191L, 15464L, 16550L, 19745L, 22110L), 
`22` = c(9761L, 10551L, 10287L, 11498L, 13384L, 15132L),
`23` = c(6995L, 7778L, 7664L, 8054L, 9801L, 11080L), `24` = c(5589L,
6107L, 5948L, 6150L, 7470L, 8810L), `25to29` = c(15902L,
18153L, 18001L, 18833L, 23578L, 27299L), `30to39` = c(17171L,
18695L, 17818L, 17861L, 22643L, 26781L), `40plus` = c(8351L, 
9686L, 9854L, 10141L, 13183L, 15888L)), .Names = c("Year", 
"17", "18", "19", "20", "21", "22", "23", "24", "25to29", "30to39",
"40plus"), row.names = c(NA, 6L), class = "data.frame")

Solution

  • I still can't run your code above because the single-letter variables are not defined, and I don't want to define them manually, so I can't reproduce the error.

    But a better way to plot your data would be to melt it first.

    library(reshape2)
    library(ggplot2)
    # Reshape from wide to long: one row per Year / age-group pair
    mm <- melt(AGEgroups, id.vars = "Year")
    

    then plot with

    ggplot(mm,aes(x=Year, y=value, fill=variable)) +
      geom_area() + ylab("Number of Applicants") + 
      scale_fill_hue(name = "Age Range", 
        labels=c(paste(17:24, "yrs"),"25 to 29", "30 to 39", "40+"))
    

    which produces

    [Image: stacked area chart of the number of applicants per year, filled by age range]

    Here we label the plot explicitly with ylab() and scale_fill_hue() rather than relying on the side effects of using imaginary variables in the aesthetics. This makes the intention of the code much clearer.
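
To make the reshaping step concrete, here is a minimal sketch that rebuilds the six rows from the dput() output in the question and melts them. It assumes the reshape2 and ggplot2 packages are installed. The object name p is only an illustration and does not appear in the original post.

```r
library(reshape2)
library(ggplot2)

# First six rows of AGEgroups, copied from the dput() output in the question.
# check.names = FALSE keeps non-syntactic column names such as "17" unchanged.
AGEgroups <- data.frame(
  Year     = 2004:2009,
  `17`     = c(1053L, 1115L, 937L, 1023L, 1273L, 1236L),
  `18`     = c(160450L, 175406L, 173806L, 176306L, 187802L, 197090L),
  `19`     = c(74600L, 77751L, 71285L, 83706L, 89462L, 97544L),
  `20`     = c(25778L, 28368L, 27003L, 29955L, 36255L, 38451L),
  `21`     = c(14317L, 15191L, 15464L, 16550L, 19745L, 22110L),
  `22`     = c(9761L, 10551L, 10287L, 11498L, 13384L, 15132L),
  `23`     = c(6995L, 7778L, 7664L, 8054L, 9801L, 11080L),
  `24`     = c(5589L, 6107L, 5948L, 6150L, 7470L, 8810L),
  `25to29` = c(15902L, 18153L, 18001L, 18833L, 23578L, 27299L),
  `30to39` = c(17171L, 18695L, 17818L, 17861L, 22643L, 26781L),
  `40plus` = c(8351L, 9686L, 9854L, 10141L, 13183L, 15888L),
  check.names = FALSE
)

# Wide to long: every age-group column becomes rows in "variable"/"value",
# while Year is kept as the identifier column.
mm <- melt(AGEgroups, id.vars = "Year")
str(mm)

# A single geom_area() layer stacks the groups defined by the fill aesthetic.
# p is a hypothetical name used only in this sketch.
p <- ggplot(mm, aes(x = Year, y = value, fill = variable)) +
  geom_area() +
  ylab("Number of Applicants")
```

With the data in long format, one geom_area() layer stacks all eleven groups on top of each other. In the layered version from the question, each geom_area() call is a separate layer drawn from zero, so a later layer can cover an earlier one. That may be why some groups seemed to be missing, although the answer does not confirm this.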