I download a monthly time series of unemployment rates from the Federal Reserve using the alfred package:
df <- alfred::get_alfred_series("UNRATE")
As unemployment data is later revised after its first release, df
contains every single observation, revised and unrevised, of UNRATE
along with the date on which the revision was posted.
> head(df)
        date realtime_period UNRATE
1 1948-01-01      1960-03-15    3.5
2 1948-02-01      1960-03-15    3.8
3 1948-03-01      1960-03-15    4.0
4 1948-04-01      1960-03-15    4.0
5 1948-05-01      1960-03-15    3.6
6 1948-06-01      1960-03-15    3.8
I'm looking to filter the dataframe to find the first realtime_period associated with each date, and can do it with dplyr:
df |>
  mutate(Delta = realtime_period - date) |>
  group_by(date) |>
  filter(Delta == min(Delta)) |>
  ungroup()
Question: How do I do this in base R (I'm using R 4.3.3) instead of using dplyr? I'm trying to avoid the tidyverse and stick with base R for consistency as its syntax rarely changes.
Sincerely
Thomas Philips
You can replace mutate with transform, and replace the grouped filter with subset + ave.
df |>
  transform(Delta = abs(realtime_period - date)) |>
  subset(Delta == ave(Delta, date, FUN = min))
transform and subset are both from {base}. ave is from {stats}, which is also part of the standard R distribution and is attached by default.
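To try the idea without downloading anything, here is a minimal sketch on a small made-up data frame. The name toy and its values are hypothetical stand-ins for the real UNRATE data, and Delta is converted to plain numbers only to keep the example simple. The second half shows an alternative that sorts with order and drops repeats with duplicated; it is a different technique, not the one suggested above, and it keeps exactly one row per date, whereas the ave version would keep ties.

```r
# Toy stand-in for the ALFRED data (values are made up for illustration)
toy <- data.frame(
  date            = as.Date(c("2020-01-01", "2020-01-01", "2020-02-01", "2020-02-01")),
  realtime_period = as.Date(c("2020-02-07", "2020-03-06", "2020-04-03", "2020-03-06")),
  UNRATE          = c(3.6, 3.5, 3.5, 3.6)
)

# Days between the observation date and each release, as plain numbers
toy$Delta <- as.numeric(toy$realtime_period - toy$date)

# ave() returns, for every row, the minimum Delta within that row's date group,
# so the comparison keeps only the earliest release for each date
first_release <- toy[toy$Delta == ave(toy$Delta, toy$date, FUN = min), ]

# Alternative sketch: sort by date and release, then keep the first row per date
sorted <- toy[order(toy$date, toy$realtime_period), ]
first_release2 <- sorted[!duplicated(sorted$date), ]

print(first_release)
print(first_release2)
```

On this toy data both approaches select the same two rows, one per date.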