This is a subset of my data:
structure(list(First.Name = c(5006L, 5006L, 5007L, 5007L, 5008L,
5009L), Session = c("Post", "Pre", "Post", "Pre", NA, "Post"),
RHR = c(65.2352941176471, 60, 62.5882352941176, 63, 63.4,
48.6060606060606), HRV = c(79.1470588235294, 73.5, 91.4117647058823,
80.5555555555556, 102.4, 146.606060606061), Hours.in.Bed = c(6.76441176470588,
6.325, 5.98058823529412, 4.86, 6.503, 5.43787878787879),
Hours.of.Sleep = c(5.88058823529412, 5.59833333333333, 4.89117647058824,
3.93666666666667, 5.933, 5.10484848484848), Sleep.Disturbances = c(6.85294117647059,
6.66666666666667, 4.52941176470588, 3.55555555555556, 5.2,
2.93939393939394), Latency.min = c(6.96558823529412, 3.31333333333333,
3.77411764705882, 2.81333333333333, 2.88, 2.90424242424242
), Cycles = c(5.73529411764706, 5.83333333333333, 3.23529411764706,
2.22222222222222, 5, 3.33333333333333), REM.Sleep.hours = c(1.42970588235294,
1.55, 0.466470588235294, 0.413333333333333, 1.42, 0.698181818181818
), Deep.Sleep.hours = c(0.612058823529412, 0.55, 1.17352941176471,
0.972222222222222, 0.68, 1.73909090909091), Light.Sleep.hours = c(3.83647058823529,
3.49666666666667, 3.25058823529412, 2.55111111111111, 3.835,
2.66636363636364), Awake.hours = c(0.881764705882353, 0.723333333333333,
1.08764705882353, 0.92, 0.568, 0.333030303030303), Missing.Data.hours = c(0,
0, 0, 0, 0, 0), Respiratory.Rate = c(NaN, NaN, NaN, NaN,
NaN, NaN), Year_Day = c(147.852941176471, 127.5, 145.117647058824,
129.888888888889, 130.5, 146), Week_Year = c(21.5588235294118,
18.6666666666667, 21.1764705882353, 19, 19.1, 21.2727272727273
)), row.names = c(NA, -6L), groups = structure(list(First.Name = 5006:5009,
.rows = structure(list(1:2, 3:4, 5L, 6L), ptype = integer(0), class = c("vctrs_list_of",
"vctrs_vctr", "list"))), row.names = c(NA, 4L), class = c("tbl_df",
"tbl", "data.frame"), .drop = TRUE), class = c("grouped_df",
"tbl_df", "tbl", "data.frame"))
which looks like:
First.Name Session RHR HRV Hours.in.Bed Hours.of.Sleep Sleep.Disturbances Latency.min Cycles REM.Sleep.hours Deep.Sleep.hours Light.Sleep.hou~
<int> <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1 5006 Post 65.2 79.1 6.76 5.88 6.85 6.97 5.74 1.43 0.612 3.84
2 5006 Pre 60 73.5 6.32 5.60 6.67 3.31 5.83 1.55 0.55 3.50
3 5007 Post 62.6 91.4 5.98 4.89 4.53 3.77 3.24 0.466 1.17 3.25
4 5007 Pre 63 80.6 4.86 3.94 3.56 2.81 2.22 0.413 0.972 2.55
5 5008 NA 63.4 102. 6.50 5.93 5.2 2.88 5 1.42 0.68 3.84
6 5009 Post 48.6 147. 5.44 5.10 2.94 2.90 3.33 0.698 1.74 2.67
I am trying to subtract specific rows, across all columns, based on the Session
column. Specifically, Post
- Pre
across columns per First.Name
ID. However, some IDs are missing either the Pre
or Post
values, or both.
So for example:
The column RHR
for subject 5006
will be Post - Pre
or 65.2-60
and so on across the columns.
I have tried variations of
DF %>%
group_by(First.Name) %>%
summarise(RHR[Session == "Post"] - RHR[Session == "Pre"])
But I'm sure there is a way to use summarize or apply functions that don't require mutating new columns for the difference in values. Help appreciated.
Maybe use summarize across to get all the rows at once.
summarize(df%>%group_by(First.Name), across(RHR:Week_Year, function(x) {return(x[1]-x[2])}))
First.Name RHR HRV Hours.in.Bed Hours.of.Sleep Sleep.Disturbances Latency.min Cycles
<int> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1 5006 5.24 5.65 0.439 0.282 0.186 3.65 -0.0980
2 5007 -0.412 10.9 1.12 0.955 0.974 0.961 1.01
3 5008 NA NA NA NA NA NA NA
4 5009 NA NA NA NA NA NA NA