Problem
I have a data.frame including 18 parameters measured at 6 timepoints. I would like to display the mean+standard deviation of each parameter according to each timepoint in a pivot table.
Expected output
An example table, which includes only two parameters and two timepoints (dput given below) for the sake of brevity:
<table>
<thead>
<tr>
<th style="text-align:left;"> param </th>
<th style="text-align:left;"> T0 </th>
<th style="text-align:left;"> T1 </th>
</tr>
</thead>
<tbody>
<tr>
<td style="text-align:left;"> P1 </td>
<td style="text-align:left;"> 0.08±0.62 </td>
<td style="text-align:left;"> 0.21±0.28 </td>
</tr>
<tr>
<td style="text-align:left;"> P2 </td>
<td style="text-align:left;"> 27.52±23.28 </td>
<td style="text-align:left;"> 28.46±25.74 </td>
</tr>
</tbody>
</table>
What I tried
I failed with gtsummary
, dt
, group_by
statements etc. So I present how I got it manually:
select(starts_with("T")) %>% select(!matches("diff"))
tidyr::pivot_longer(everything(), cols_vary = "slowest", names_to = c("time", "param"), names_sep = "_")
custom_aggr <- function(x) {
s <- shapiro.test(x)
if(s$p.value < 0.05) {
paste0(round(median(x, na.rm = T), 2), " (",
round(quantile(x, na.rm = T)[2][[1]], 2), "-",
round(quantile(x, na.rm = T)[4][[1]], 2), ")")
} else {
paste0(round(mean(x, na.rm = T), 2), "±", round(sd(x, na.rm = T), 2))
}
}
tidyr::pivot_wider(names_from = "time", values_from = "value", values_fn = custom_aggr) %>% kable(format = "html)
The whole process can be applied to the example dput with:
example_dput %>%
select(starts_with("T")) %>% select(!matches("diff")) %>%
tidyr::pivot_longer(everything(), cols_vary = "slowest", names_to = c("time", "param"), names_sep = "_") %>%
tidyr::pivot_wider(names_from = "time", values_from = "value", values_fn = mnsd2) %>%
kable(format = "html")
Questions
This is a fairly common case in healthcare-related research, so I expect a more straightforward solution. Yet, I couldn't find a hint in tutorials related to table creation with R, of which the most helpful was the gtsummary
. I prefer a solution with gtsummary, which works well with rmarkdown
and exports directly to word or pdf, since I plan to repeat this analysis periodically. That's whay I tagged the question with gtsummary
.
Also, the real data contains many grouping variables, which will be used to display the data. This and the future need for a stratified display makes me appreciate a functional approach like that of gt_summary.
I would like to know if:
group
variable.Finally, I admit that I cannot fully appreciate the use cases of libraries lke tidyr or gt_summary. I occasionally encounter difficulties with manipulating tibbles, yet find it more comfortable to work with tidyr. This mix and match approach makes me feel as a script kiddie and may be the source of difficulties/errors in my code.
Checking so, I understand that this problem may have several solutions with base R, and other packages. I öay not appreciate that the solution requires use of multiple packages. So I expect answer(s) explaining the general approach, comparing some alternatives as required, and including some example code. Commenting on the inefficiencies/errors on presented code are welcome.
Thank you for reading so far.
Example as dput
An example df including one grouping variable group
, two parameters P1
and P2
is given below to work on:
structure(list(group = c(0, 1, 0, 1, 1, 1, 0, 0, 1, 0, 0, 1, 0,
0, 1, 1, 0, 1, 1, 0, 1, 1, 0, 0, 0, 0, 0, 1, 1, 0, 0, 1, 0, 0,
1, 1, 1, 1, 1, 1, 0, 1, 1, 1, 0, 1, 1, 0, 1, 1, 0, 1, 1), T0_P1 = c(0.354515,
0.400612, 0.22268, 0.356677, -0.36244, 0.0749541, 0.654674, 0.410779,
0.282992, 0.279816, 0.359564, 0.498958, 0.207068, 0.538793, -0.791641,
0.324721, 0.566456, 0.423489, 0.400608, -0.233297, 0.68386, -0.28549,
-0.291538, 0.516093, 0.143584, 0.231018, 0.468302, 0.0280513,
0.52974, 0.475668, -0.038977, 0.535264, 0.420121, 0.343218, 0.308243,
0.10415, 0.017395, 0.548448, 0.0740497, -1.0745, -0.087471, 0.294959,
0.282871, -2.08282, -2.68453, 0.283177, 0.200626, 0.0388438,
0.432551, -0.861437, -0.174149, -0.120327, 0.110401), T1_P1 = c(0.370035,
0.408102, 0.249126, 0.381831, -0.505082, 0.0521383, 0.624644,
0.374959, 0.172251, 0.140744, 0.34101, 0.424967, 0.174693, 0.115618,
-0.367998, 0.385345, 0.603818, 0.382661, 0.248948, 0.197852,
0.694853, -0.212129, -0.129636, 0.513091, 0.0707203, 0.285887,
0.478958, 0.632496, 0.44714, 0.485816, -0.103302, 0.492024, 0.46066,
0.365897, 0.290883, 0.0913257, -0.00347047, 0.3081, 0.161523,
-0.018373, -0.0656219, -0.326731, 0.204379, 0.0954256, 0.40172,
0.240711, 0.432768, 0.295718, 0.108598, 0.045233, -0.49056, -0.0479581,
0.228132), T1.T0.diff_P1 = c(4, 2, 12, 7, 39, -30, -5, -9, -39,
-50, -5, -15, -16, -79, -54, 19, 7, -10, -38, -185, 2, -26, -56,
-1, -51, 24, 2, 2155, -16, 2, 165, -8, 10, 7, -6, -12, -120,
-44, 118, -98, -25, -211, -28, -105, -115, -15, 116, 661, -75,
-105, 182, -60, 107), T2_P1 = c(0.373786, 0.395853, 0.134041,
0.378412, -0.415208, 0.143939, 0.582672, 0.386103, 0.215891,
-0.274945, 0.279119, 0.438118, 0.234042, 0.34044, -0.573318,
0.456852, 0.604797, 0.336034, 0.450904, 0.287317, 0.680436, -0.409743,
-0.0853578, 0.562901, 0.066638, 0.371619, 0.52145, 0.548569,
0.514376, 0.479195, 0.053544, 0.177107, 0.35743, 0.422869, 0.331834,
0.146896, -0.0388651, 0.36162, 0.12057, 0.00685142, -0.000308817,
-0.208294, 0.290837, 0.154187, 0.446681, 0.233519, 0.117889,
0.254841, 0.0637857, 0.143729, -0.206059, -0.0150699, -0.219658
), T2.T0.diff_P1 = c(5, -1, -40, 6, 15, 92, -11, -6, -24, -198,
-22, -12, 13, -37, -28, 41, 7, -21, 13, -223, -1, 44, -71, 9,
-54, 61, 11, 1856, -3, 1, -237, -67, -15, 23, 8, 41, -323, -34,
63, -101, -100, -171, 3, -107, -117, -18, -41, 556, -85, -117,
18, -87, -299), T2.T1.diff_P1 = c(1, -3, -46, -1, -18, 176,
-7, 3, 25, -295, -18, 3, 34, 194, 56, 19, 0, -12, 81, 45, -2,
93, -34, 10, -6, 30, 9, -13, 15, -1, -152, -64, -22, 16, 14,
61, 1020, 17, -25, -137, -100, -36, 42, 62, 11, -3, -73, -14,
-41, 218, -58, -69, -196), T3_P1 = c(-0.108073, -0.157872, 0.124176,
0.235193, 0.19941, -0.0932395, 0.264204, 0.0910636, -0.197725,
0.0419729, -0.393531, 0.116635, 0.121932, 0.585601, -0.412341,
0.238735, 0.380094, 0.249455, 0.339111, 0.00476125, 0.514552,
-0.703535, 0.345998, 0.110932, -0.7302, -0.113644, 0.369359,
0.266292, 0.526663, 0.0106685, -1.01393, -2.39016, -0.653869,
0.0979844, 0.0634867, 0.0336409, -0.284115, 0.205695, 0.144771,
0.0072521, -0.182956, -0.165441, 0.227927, 0.00188442, 0.4398,
0.160438, 0.451799, -0.476955, 0.075239, 0.088576, -0.347784,
0.136475, -0.213508), T3.T0.diff_P1 = c(-130, -139, -44, -34,
-155, -224, -60, -78, -170, -85, -209, -77, -41, 9, -48, -26,
-33, -41, -15, -102, -25, 146, -219, -79, -609, -149, -21, 849,
-1, -98, 2501, -547, -256, -71, -79, -68, -1733, -62, 96, -101,
109, -156, -19, -100, -116, -43, 125, -1328, -83, -110, 100,
-213, -293), T3.T1.diff_P1 = c(-129, -139, -50, -38, -139, -279,
-58, -76, -215, -70, -215, -73, -30, 406, 12, -38, -37, -35,
36, -98, -26, 232, -367, -78, -1133, -140, -23, -58, 18, -98,
882, -586, -242, -73, -78, -63, 8087, -33, -10, -139, 179, -49,
12, -98, 9, -33, 4, -261, -31, 96, -29, -385, -194), T4_P1 = c(-0.127339,
0.234353, 0.0615958, 0.174939, 0.249978, -0.428429, 0.276876,
0.238878, -0.0508291, -0.11582, -0.25588, 0.35472, 0.114993,
0.544118, -0.2078, -0.141583, 0.501062, -0.0577436, 0.449832,
-0.819736, 0.536002, -0.671793, 0.449196, 0.166851, -0.848327,
-0.148966, 0.608949, 0.299139, 0.358773, -0.122672, -0.789994,
-0.470871, -0.337386, 0.363851, 0.135002, -0.2105, -0.355363,
0.211093, 0.0403827, -0.401496, -0.163691, 0.161768, 0.0766178,
-0.19001, 0.44916, 0.22669, 0.400209, -0.221847, 0.409649, -0.626259,
-0.841326, 0.197649, -0.365613), T4.T0.diff_P1 = c(-136, -42,
-72, -51, -169, -672, -58, -42, -118, -141, -171, -29, -44, 1,
-74, -144, -12, -114, 12, 251, -22, 135, -254, -68, -691, -164,
30, 966, -32, -126, 1927, -188, -180, 6, -56, -302, -2143, -62,
-45, -63, 87, -45, -73, -91, -117, -20, 99, -671, -5, -27, 383,
-264, -431), T4.T1.diff_P1 = c(-134, -43, -75, -54, -149, -922,
-56, -36, -130, -182, -175, -17, -34, 371, -44, -137, -17, -115,
81, -514, -23, 217, -447, -67, -1300, -152, 27, -53, -20, -125,
665, -196, -173, -1, -54, -330, 10140, -31, -75, 2085, 149, -150,
-63, -299, 12, -6, -8, -175, 277, -1485, 72, -512, -260), T5_P1 = c(-0.1898,
0.278567, 0.191452, 0.242484, 0.188506, -0.254942, 0.123944,
0.121264, -0.058461, -0.0706581, -0.223103, 0.369762, 0.154631,
0.56503, -0.336153, -0.152756, 0.405927, -0.0312427, 0.102258,
-1.10958, 0.533744, -0.507479, -0.255318, 0.472308, -0.839511,
-0.164691, 0.511855, -0.0619669, 0.385496, -0.0389004, -0.143956,
0.0928928, -0.436918, 0.419153, 0.234921, -0.161249, -0.236711,
0.121568, 0.109438, -0.150289, -0.0742779, 0.332437, 0.274052,
-0.285785, 0.492551, 0.304347, 0.473602, -0.091301, -0.0833264,
-0.492741, 0.0980397, 0.28639, -0.122452), T5.T0.diff_P1 = c(-154,
-30, -14, -32, -152, -440, -81, -70, -121, -125, -162, -26, -25,
5, -58, -147, -28, -107, -74, 376, -22, 78, -12, -8, -685, -171,
9, -321, -27, -108, 269, -83, -204, 22, -24, -255, -1461, -78,
48, -86, -15, 13, -3, -86, -118, 7, 136, -335, -119, -43, -156,
-338, -211), T5.T1.diff_P1 = c(-151, -32, -23, -36, -137, -589,
-80, -68, -134, -150, -165, -13, -11, 389, -9, -140, -33, -108,
-59, -661, -23, 139, 97, -8, -1287, -158, 7, -110, -14, -108,
39, -81, -195, 15, -19, -277, 6721, -61, -32, 718, 13, -202,
34, -399, 23, 26, 9, -131, -177, -1189, -120, -697, -154), T6_P1 = c(0.0704855,
0.126056, -0.041206, -0.50256, -0.0191214, 0.0130739, -0.0660002,
-0.352739, -0.404444, -0.00161666, 0.0881695, 0.501725, 0.182178,
0.201283, -0.321061, -0.496934, 0.430376, 0.18756, 0.469647,
-0.297791, 0.325876, -0.949504, 0.202842, 0.451688, -0.5095,
0.111856, -0.129408, 0.357865, 0.292238, 0.143681, -0.465477,
-0.447798, -0.166457, 0.170199, -0.106613, 0.11888, -0.0224689,
0.290059, -1.02956, 0.0534638, -0.876711, 0.0234375, -0.208117,
0.121276, -0.10558, -3.63428, -0.178923, -0.27206, -1.187, -0.0671986,
-0.185629, -0.102337, 0.135412), T0_P2 = c(23.4767, 80.6425,
27.6929, 36.3399, 0.222854, 15.4059, 16.3445, 10.3642, 60.1418,
10.7158, 73.6641, 74.3803, 82.1072, 28.709, 19.5041, 22.0872,
24.0681, 10.1122, 17.4177, 17.5524, 23.7382, 14.7304, 94.7995,
16.6482, 30.4161, 23.0676, 14.3303, 47.4505, 34.8889, 12.017,
17.7335, 0, 21.6478, 13.8952, 12.9003, 17.1954, 22.6989, 99.8425,
20.3926, 4.82337, 29.8876, 15.7117, 18.5338, 33.1752, 14.0226,
15.0854, 8.91286, 26.1874, 23.1092, 35.0972, 5.55667, 11.1803,
27.9371), T1_P2 = c(16.1512, 16.0162, 25.7443, 25.5485, 20.5682,
19.2128, 13.2216, 60.5308, 50.5352, 17.9931, 10.7115, 93.1466,
77.2653, 13.0037, 20.4135, 10.9121, 22.838, 75.1338, 31.0509,
15.826, 84.0741, 13.0837, 19.1441, 61.3631, 13.0631, 32.1478,
83.8708, 0, 10.8444, 40.9282, 18.9858, 0, 97.1279, 10.1327, 61.5105,
23.271, 21.4968, 75.2181, 15.4974, 13.6953, 10.9891, 17.2444,
14.447, 25.7276, 11.0123, 7.84029, 12.0358, 14.6238, 26.1094,
19.4256, 4.30366, 4.01792, 9.30375), T1.T0.diff_P2 = c(-31,
-80, -7, -30, 9129, 25, -19, 484, -16, 68, -85, 25, -6, -55,
5, -51, -5, 643, 78, -10, 254, -11, -80, 269, -57, 39, 485, -100,
-69, 241, 7, NaN, 349, -27, 377, 35, -5, -25, -24, 184, -63,
10, -22, -22, -21, -48, 35, -44, 13, -45, -23, -64, -67), T2_P2 = c(17.6024,
13.7435, 31.066, 36.6922, 17.9928, 23.0723, 61.4986, 90.0946,
49.7064, 18.4268, 93.6249, 64.6212, 66.0091, 51.8095, 85.8795,
17.8472, 20.5687, 10.1429, 17.3647, 24.6992, 20.6315, 57.3102,
72.0594, 64.146, 10.0122, 80.8444, 12.5641, 91.9954, 54.9446,
43.0989, 17.8794, 13.4608, 11.6372, 58.2568, 38.3219, 22.9037,
11.2806, 15.4506, 10.765, 12.843, 9.43461, 5.55588, 10.5876,
25.9979, 20.3327, 9.23773, 10.9801, 82.6116, 14.2446, 25.8306,
10.359, 7.52712, 18.3046), T2.T0.diff_P2 = c(-25, -83, 12, 1,
7974, 50, 276, 769, -17, 72, 27, -13, -20, 80, 340, -19, -15,
0, 0, 41, -13, 289, -24, 285, -67, 250, -12, 94, 57, 259, 1,
Inf, -46, 319, 197, 33, -50, -85, -47, 166, -68, -65, -43, -22,
45, -39, 23, 215, -38, -26, 86, -33, -34), T2.T1.diff_P2 = c(9,
-14, 21, 44, -13, 20, 365, 49, -2, 2, 774, -31, -15, 298, 321,
64, -10, -87, -44, 56, -75, 338, 276, 5, -23, 151, -85, Inf,
407, 5, -6, Inf, -88, 475, -38, -2, -48, -79, -31, -6, -14, -68,
-27, 1, 85, 18, -9, 465, -45, 33, 141, 87, 97), T3_P2 = c(68.731,
12.7383, 25.7426, 30.4209, 95.1241, 30.5427, 69.711, 95.4164,
66.3215, 12.0129, 79.4669, 13.884, 11.3739, 39.1618, 31.8607,
11.3097, 10.465, 45.7652, 10.2818, 25.492, 35.3997, 44.1852,
28.2852, 11.1805, 69.4717, 77.7511, 10.573, 14.9569, 86.5404,
34.5778, 30.6023, 89.6231, 16.3404, 32.2442, 67.5567, 15.6465,
17.3722, 30.3812, 10.2457, 4.68845, 13.1288, 10.1949, 8.64307,
5.27611, 22.0138, 10.4523, 7.4947, 77.573, 14.8642, 19.6778,
15.5668, 3.56341, 22.285), T3.T0.diff_P2 = c(193, -84, -7, -16,
42584, 98, 327, 821, 10, 12, 8, -81, -86, 36, 63, -49, -57, 353,
-41, 45, 49, 200, -70, -33, 128, 237, -26, -68, 148, 188, 73,
Inf, -25, 132, 424, -9, -23, -70, -50, -3, -56, -35, -53, -84,
57, -31, -16, 196, -36, -44, 180, -68, -20), T3.T1.diff_P2 = c(326,
-20, 0, 19, 362, 59, 427, 58, 31, -33, 642, -85, -85, 201, 56,
4, -54, -39, -67, 61, -58, 238, 48, -82, 432, 142, -87, Inf,
698, -16, 61, Inf, -83, 218, 10, -33, -19, -60, -34, -66, 19,
-41, -40, -79, 100, 33, -38, 430, -43, 1, 262, -11, 140), T4_P2 = c(14.0407,
86.8672, 11.4711, 30.0838, 67.8631, 81.9849, 72.5374, 83.0472,
65.794, 85.05, 10.5096, 74.2366, 10.8808, 53.6499, 13.5859, 99.9067,
12.1605, 13.9899, 40.1762, 20.1221, 16.0165, 38.9638, 11.5921,
75.9692, 10.5741, 13.1628, 67.1616, 11.6427, 10.1465, 10.3222,
48.7381, 77.0429, 32.6667, 16.3706, 11.0336, 11.8518, 18.4726,
10.0398, 3.48196, 8.6861, 6.32977, 18.1642, 18.7497, 12.7896,
17.9789, 15.0961, 10.5745, 19.8159, 22.8705, 5.8275, 16.8047,
3.72111, 8.86339), T4.T0.diff_P2 = c(-40, 8, -59, -17, 30352,
432, 344, 701, 9, 694, -86, 0, -87, 87, -30, 352, -49, 38, 131,
15, -33, 165, -88, 356, -65, -43, 369, -75, -71, -14, 175, Inf,
51, 18, -14, -31, -19, -90, -83, 80, -79, 16, 1, -61, 28, 0,
19, -24, -1, -83, 202, -67, -68), T4.T1.diff_P2 = c(-13, 442,
-55, 18, 230, 327, 449, 37, 30, 373, -2, -20, -86, 313, -33,
816, -47, -81, 29, 27, -81, 198, -39, 24, -19, -59, -20, Inf,
-6, -75, 157, Inf, -66, 62, -82, -49, -14, -87, -78, -37, -42,
5, 30, -50, 63, 93, -12, 36, -12, -70, 290, -7, -5), T5_P2 = c(10.5461,
29.1796, 88.1755, 38.6234, 22.0569, 10.001, 13.4991, 14.8506,
19.519, 95.617, 17.8181, 84.2492, 10.7215, 88.9674, 76.4191,
45.19, 10.8317, 31.8107, 24.7983, 55.9251, 16.4832, 43.3698,
22.1386, 18.002, 79.2035, 12.6423, 85.4316, 32.1691, 75.6754,
53.5464, 17.2313, 15.1174, 59.3047, 12.0834, 82.7295, 81.1334,
10.7536, 30.5454, 12.4223, 9.50655, 5.0372, 8.15472, 10.8593,
6.14378, 5.62171, 6.21806, 4.05756, 12.0619, 21.1074, 12.9207,
13.4733, 32.445, 20.8236), T5.T0.diff_P2 = c(-55, -64, 218,
6, 9797, -35, -17, 43, -68, 792, -76, 13, -87, 210, 292, 105,
-55, 215, 42, 219, -31, 194, -77, 8, 160, -45, 496, -32, 117,
346, -3, Inf, 174, -13, 541, 372, -53, -69, -39, 97, -83, -48,
-41, -81, -60, -59, -54, -54, -9, -63, 142, 190, -25), T5.T1.diff_P2 = c(-35,
82, 243, 51, 7, -48, 2, -75, -61, 431, 66, -10, -86, 584, 274,
314, -53, -58, -20, 253, -80, 231, 16, -71, 506, -61, 2, Inf,
598, 31, -9, Inf, -39, 19, 34, 249, -50, -59, -20, -31, -54,
-53, -25, -76, -49, -21, -66, -18, -19, -33, 213, 708, 124),
T6_P2 = c(19.4431, 18.3498, 36.6816, 20.8788, 20.7658, 22.2604,
60.7024, 10.4609, 98.5593, 62.8739, 71.6437, 22.755, 97.5644,
98.5916, 93.0426, 90.9254, 72.4049, 86.6857, 13.1171, 28.2106,
28.0355, 54.2341, 10.2572, 12.291, 56.9864, 10.5199, 68.7879,
32.782, 61.1115, 16.1887, 29.1698, 66.4892, 23.1343, 10.191,
20.0709, 38.6514, 91.7475, 91.9235, 5.50988, 22.1358, 22.3813,
12.6712, 8.06821, 22.8411, 15.0756, 6.79443, 26.9108, 18.2863,
9.69334, 4.18761, 22.297, 24.9186, 48.9786)), class = "data.frame", row.names = c(NA,
53L))
To use gtsummary, you'll need your data in long format with one line per person per timepoint. See below
library(gtsummary)
library(tidyverse)
# get data into one line per person, per timepoint
df_restrucutre <-
df |>
select(group, matches("^T[0-9]_")) |>
mutate(id = row_number()) |>
pivot_longer(
cols = -c(id, group),
names_to = "time_param"
) |>
mutate(
timepoint = word(time_param, 1L, sep = "_"),
param = word(time_param, 2L, sep = "_")
) |>
pivot_wider(
id_cols = c(id, group, timepoint),
names_from = param,
values_from = value
)
df_restrucutre |>
tbl_summary(
by = timepoint,
include = starts_with("P"),
statistic = all_continuous() ~ "{mean} ± {sd}",
digits = all_continuous() ~ 2
) |>
# convert to kable so the table will display on stackoverflow
as_kable()
Characteristic | T0, N = 53 | T1, N = 53 | T2, N = 53 | T3, N = 53 | T4, N = 53 | T5, N = 53 | T6, N = 53 |
---|---|---|---|---|---|---|---|
P1 | 0.08 ± 0.62 | 0.21 ± 0.28 | 0.21 ± 0.28 | -0.03 ± 0.47 | -0.01 ± 0.39 | 0.02 ± 0.34 | -0.15 ± 0.61 |
P2 | 27.52 ± 23.28 | 28.46 ± 25.74 | 33.66 ± 26.67 | 32.91 ± 27.28 | 30.74 ± 28.08 | 32.40 ± 28.18 | 38.48 ± 29.91 |
Created on 2023-04-23 with reprex v2.0.2