r ggplot2 ggpmisc

Labeling extrema with stat_peaks/stat_valleys produces duplicate labels


I extracted some longitudinal temperature data from a .nc weather dataset (using the ncdf4 package) and would like to label the local extrema with their dates from the x-axis, using ggplot2 and its extension ggpmisc, which provides stat_peaks/stat_valleys. Oddly, all the labels read the same: "Dec 1969".

My first suspicion was that the x-axis data was not stored as a proper Date, but the x-axis displays correctly and I have checked the class of the input column to confirm that it is Date. I also tried adding group=1, which made no difference to the labels. I admit I am new to R and ggplot2 (more familiar with Python/Pandas) and do not fully understand what group=1 does, though it was necessary to get the line to display correctly. Is this perhaps a bug?

ggplot(df_denver, aes(x=Date, y=Temp..C., group=1)) + 
  geom_line() +
  scale_x_date(date_labels="%b %Y", date_breaks = "10 years", expand=c(0,0)) +
  stat_peaks(span=24, ignore_threshold = 0.80, color="red") +
  stat_peaks(geom="text", span=24, ignore_threshold = 0.80, x.label.fmt = "%b %Y", color="red", angle=90, hjust=-0.1) +
  stat_valleys(span=24, ignore_threshold = 0.55, color="blue") +
  stat_valleys(geom="text", span=24, ignore_threshold = 0.55, x.label.fmt = "%b %Y", color="blue", angle=90, hjust=1.1) +
  labs(x="Date", y="Temp (C)", title="Monthly Air Surface Temp for Denver from 1880 on")

Here are the first 100 rows of my dataset that produce 3 peaks and 3 valleys to illustrate:

          Date    Temp..C.
1   1880-01-01  2.91287017
2   1880-02-01 -2.73586297
3   1880-03-01 -2.04185677
4   1880-04-01  0.37948364
5   1880-05-01  0.78548384
6   1880-06-01  0.44176754
7   1880-07-01 -1.06966007
8   1880-08-01 -0.53162575
9   1880-09-01 -0.29665694
10  1880-10-01 -2.08401608
11  1880-11-01 -9.46955109
12  1880-12-01 -1.52052176
13  1881-01-01 -2.53366208
14  1881-02-01 -1.88263988
15  1881-03-01 -0.06864686
16  1881-04-01  3.32321167
17  1881-05-01  1.75613177
18  1881-06-01  2.82765651
19  1881-07-01  1.76543093
20  1881-08-01  1.39409852
21  1881-09-01 -0.98141575
22  1881-10-01 -0.63346595
23  1881-11-01 -1.95676208
24  1881-12-01  3.28983855
25  1882-01-01 -0.64792717
26  1882-02-01  2.15854502
27  1882-03-01  2.91465187
28  1882-04-01  0.56616443
29  1882-05-01 -1.89441001
30  1882-06-01 -0.63149375
31  1882-07-01 -0.64883423
32  1882-08-01  0.82802373
33  1882-09-01  0.66150969
34  1882-10-01 -0.54113626
35  1882-11-01 -1.21310496
36  1882-12-01  1.30559540
37  1883-01-01 -1.41802752
38  1883-02-01 -6.39232874
39  1883-03-01  2.96320987
40  1883-04-01 -0.48122203
41  1883-05-01 -0.99614143
42  1883-06-01 -0.67229420
43  1883-07-01 -0.56595141
44  1883-08-01  0.52161294
45  1883-09-01  0.09190032
46  1883-10-01 -2.65115738
47  1883-11-01  1.88332438
48  1883-12-01 -0.19942272
49  1884-01-01 -0.34669495
50  1884-02-01 -2.21085262
51  1884-03-01  0.55254096
52  1884-04-01 -1.21859336
53  1884-05-01 -0.40969065
54  1884-06-01  0.44454563
55  1884-07-01  1.28881764
56  1884-08-01 -1.09331822
57  1884-09-01  1.52377772
58  1884-10-01  1.76569140
59  1884-11-01  0.72411090
60  1884-12-01 -4.64927006
61  1885-01-01 -1.03242493
62  1885-02-01 -0.79325873
63  1885-03-01  0.65910935
64  1885-04-01 -0.10181000
65  1885-05-01 -1.50702798
66  1885-06-01 -1.25801849
67  1885-07-01 -0.88433135
68  1885-08-01 -1.18410277
69  1885-09-01  0.15284735
70  1885-10-01 -0.91721576
71  1885-11-01  1.82403481
72  1885-12-01  1.68553519
73  1886-01-01 -4.21202993
74  1886-02-01  2.43953681
75  1886-03-01 -2.24947429
76  1886-04-01 -1.22557247
77  1886-05-01  2.66594267
78  1886-06-01 -0.21662886
79  1886-07-01  1.09909940
80  1886-08-01  0.63720244
81  1886-09-01 -0.11845125
82  1886-10-01  0.49225059
83  1886-11-01 -3.16969180
84  1886-12-01  2.18220520
85  1887-01-01  0.51427501
86  1887-02-01 -0.69656581
87  1887-03-01  3.96693182
88  1887-04-01  0.92614591
89  1887-05-01  1.66550291
90  1887-06-01  1.88668025
91  1887-07-01 -1.48990893
92  1887-08-01 -0.98355341
93  1887-09-01  0.93172997
94  1887-10-01 -1.12551820
95  1887-11-01  1.07798636
96  1887-12-01 -2.15758419
97  1888-01-01 -1.69266903
98  1888-02-01  2.55955243
99  1888-03-01 -1.83599913
100 1888-04-01  3.63450384

As you can see, the labels produced by stat_peaks and stat_valleys are all identical, and they fall outside the date range of the abbreviated data instead of matching the dates on the x-axis.

[Figure: Monthly Air Surface Temp for Denver from 1880 on]


Solution

  • The stat_peaks and stat_valleys labels work when the dates are stored as POSIXct rather than Date:

    df_denver$Date <- as.POSIXct(df_denver$Date, format = "%Y-%m-%d")
    
    ggplot(df_denver, aes(x=Date, y=Temp..C.)) + 
      geom_line() +
      scale_x_datetime(date_labels="%b %Y", date_breaks = "1 year", expand=c(0,0)) +
      stat_peaks(span=24, ignore_threshold = 0.80, color="red") +
      stat_peaks(geom="text", span=24, ignore_threshold = 0.80, x.label.fmt = "%b %Y", color="red", angle=90, hjust=-0.1) +
      stat_valleys(span=24, ignore_threshold = 0.55, color="blue") +
      stat_valleys(geom="text", span=24, ignore_threshold = 0.55, x.label.fmt = "%b %Y", color="blue", angle=90, hjust=1.1) +
      labs(x="Date", y="Temp (C)", title="Monthly Air Surface Temp for Denver from 1880 on") +
      expand_limits(y = 6)
    

    Note: scale_x_date was changed to scale_x_datetime to match the POSIXct column. In addition, date_breaks was changed to "1 year" so that x-axis labels appear for the example data, and expand_limits was added so that the peak labels are readable. group=1 should not be needed.

    [Figure: ggplot with stat_peaks and stat_valleys labelled]
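
One plausible explanation for the identical "Dec 1969" labels, offered as an assumption rather than a confirmed description of ggpmisc internals: a Date is stored as a count of days since 1970-01-01, whereas a POSIXct is a count of seconds. If the day count is formatted as though it were seconds, every date in this dataset lands a few hours before the epoch. A minimal base-R sketch of that mix-up (the variable names are illustrative only):

```r
# A Date from the example data; internally a (negative) number of
# days since 1970-01-01.
d <- as.Date("1880-11-01")
days <- as.numeric(d)

# Assumed failure mode: the day count is read as seconds since the epoch.
misread <- as.POSIXct(days, origin = "1970-01-01", tz = "UTC")
misread_label <- format(misread, "%Y-%m", tz = "UTC")

# Converting the Date itself to POSIXct keeps the intended instant.
converted <- as.POSIXct(d)
converted_label <- format(converted, "%Y-%m", tz = "UTC")

print(misread_label)    # [1] "1969-12"
print(converted_label)  # [1] "1880-11"
```

With "%b %Y" as the format, the misread value would print as "Dec 1969" in an English locale, which matches the labels in the question. If that is indeed the cause, it would also explain why converting the column with as.POSIXct fixes the labels.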