I must be missing something simple, but for the life of me I cannot identify why I am not able to reproduce the behavior of tidytext::reorder_within() on my own data.
Everything works fine for the iris data. For my data, the expected output is that Supercross lap times should be ordered by median rider lap time within each facet of event_track. Specifically, Sexton should be in the top row for both Anaheim 1 and Arlington because he had the highest median lap time in both races. But as you can see, Lawrence is incorrectly in the top row for Arlington.
Can anyone see how my code or data are different from the iris data, or where my code might be wrong?
library(tidyverse)
library(tidytext)
library(ggpubr)
iris %>%
  pivot_longer(cols = -Species, names_to = "flower_attribute") %>%
  ggplot(aes(x = reorder_within(Species, by = value, within = flower_attribute, fun = median), y = value)) +
  geom_point(stat = "summary") +
  geom_errorbar(stat = "summary", fun.data = "median_mad", width = 0.1) +
  geom_line(aes(group = 1), stat = "summary", fun.y = "median_mad") +
  scale_x_reordered() +
  coord_flip() +
  facet_wrap(~ flower_attribute, scales = "free") +
  labs(x = "Species") +
  theme_minimal()
#> Warning in geom_line(aes(group = 1), stat = "summary", fun.y = "median_mad"):
#> Ignoring unknown parameters: `fun.y`
#> No summary function supplied, defaulting to `mean_se()`
#> No summary function supplied, defaulting to `mean_se()`
#> No summary function supplied, defaulting to `mean_se()`
#> No summary function supplied, defaulting to `mean_se()`
#> No summary function supplied, defaulting to `mean_se()`
#> No summary function supplied, defaulting to `mean_se()`
#> No summary function supplied, defaulting to `mean_se()`
#> No summary function supplied, defaulting to `mean_se()`
plot_dat <- tibble::tribble(
~race_lap, ~race_type, ~event_track, ~rider_number, ~rider_name, ~rider_lap_time,
17L, "Main Event", "Arlington", 21L, "Anderson", 0.788016667,
2L, "Main Event", "Arlington", 21L, "Anderson", 0.75555,
5L, "Main Event", "Arlington", 96L, "Lawrence", 0.781033333,
14L, "Main Event", "Daytona", 21L, "Anderson", 1.99495,
3L, "Heat 2", "Arlington", 21L, "Anderson", 0.7576,
7L, "Main Event", "Daytona", 18L, "Lawrence", 1.502983333,
1L, "Main Event", "Daytona", 3L, "Tomac", 1.54585,
5L, "Main Event", "Arlington", 3L, "Tomac", 0.7759,
5L, "Heat 1", "Arlington", 1L, "Sexton", 0.742483333,
5L, "Main Event", "Anaheim 1", 3L, "Tomac", 1.05025,
3L, "Heat 2", "Anaheim 1", 2L, "Webb", 1.036916667,
3L, "Main Event", "Anaheim 1", 21L, "Anderson", 1.046483333,
2L, "Main Event", "Arlington", 3L, "Tomac", 0.781366667,
10L, "Main Event", "Daytona", 21L, "Anderson", 1.62525,
12L, "Main Event", "Anaheim 1", 1L, "Sexton", 1.0647,
5L, "Heat 1", "Anaheim 1", 96L, "Lawrence", 1.108883333,
4L, "Heat 1", "Arlington", 1L, "Sexton", 0.75045,
6L, "Heat 2", "Arlington", 2L, "Webb", 0.781883333,
2L, "Heat 2", "Anaheim 1", 2L, "Webb", 1.03985,
15L, "Main Event", "Arlington", 3L, "Tomac", 0.7576,
3L, "Heat 1", "Anaheim 1", 96L, "Lawrence", 1.085033333,
3L, "Main Event", "Anaheim 1", 2L, "Webb", 1.046666667,
20L, "Main Event", "Arlington", 96L, "Lawrence", 0.790683333,
2L, "Heat 1", "Anaheim 1", 96L, "Lawrence", 1.084433333,
5L, "Main Event", "Arlington", 18L, "Lawrence", 0.76185,
11L, "Main Event", "Daytona", 2L, "Webb", 1.594066667,
4L, "Main Event", "Daytona", 18L, "Lawrence", 1.507066667,
8L, "Heat 1", "Arlington", 3L, "Tomac", 0.7554,
6L, "Heat 2", "Arlington", 18L, "Lawrence", 0.756916667,
18L, "Main Event", "Anaheim 1", 1L, "Sexton", 1.10345,
14L, "Main Event", "Arlington", 3L, "Tomac", 0.758866667,
16L, "Main Event", "Arlington", 1L, "Sexton", 0.768916667,
3L, "Main Event", "Arlington", 2L, "Webb", 0.76595,
9L, "Heat 1", "Arlington", 1L, "Sexton", 0.775583333,
9L, "Main Event", "Daytona", 1L, "Sexton", 1.5748,
1L, "Main Event", "Arlington", 2L, "Webb", 0.7579,
4L, "Main Event", "Anaheim 1", 3L, "Tomac", 1.062833333,
24L, "Main Event", "Arlington", 18L, "Lawrence", 0.782866667,
11L, "Main Event", "Daytona", 96L, "Lawrence", NA,
17L, "Main Event", "Arlington", 2L, "Webb", 0.78965
)
plot_dat %>%
  ggplot(aes(x = reorder_within(rider_name, by = rider_lap_time, within = event_track, fun = median), y = rider_lap_time)) +
  geom_point(stat = "summary", fun.data = "median_mad") +
  geom_errorbar(stat = "summary", fun.data = "median_mad", width = 0.1) +
  geom_line(stat = "summary", fun.data = "median_mad", aes(group = 1)) +
  scale_x_reordered() +
  coord_flip() +
  facet_wrap(~ event_track, scales = "free") +
  labs(x = "Rider", y = "Lap Time (Minutes)") +
  theme_minimal()
#> Warning: Removed 1 row containing non-finite outside the scale range
#> (`stat_summary()`).
#> Warning: Removed 1 row containing non-finite outside the scale range (`stat_summary()`).
#> Removed 1 row containing non-finite outside the scale range (`stat_summary()`).
Created on 2024-12-23 with reprex v2.1.1
You have one missing value in rider_lap_time, for Daytona/Lawrence. For that group the median function returns NA, so it is placed last in the order.
To ignore missing values, you need to pass the na.rm = TRUE argument to the median function.
At the same time, you don't need to worry about the median_mad function from ggpubr: it ignores missing values by design.
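The effect is easy to reproduce on a toy example. This is only a minimal sketch with made-up riders and times (nothing here comes from plot_dat), and it calls stats::reorder() directly, on the assumption that reorder_within() delegates to it after pasting the two columns together, so the NA handling should carry over.

```r
# Hypothetical riders and lap times; rider "B" has one missing lap
rider <- factor(c("A", "A", "B", "B", "C", "C"))
time  <- c(1.0, 1.2, 0.8, NA, 0.9, 1.1)

# A single NA makes the plain median NA for that group
med_plain <- median(c(0.8, NA))
med_narm  <- median(c(0.8, NA), na.rm = TRUE)

# reorder() sorts levels by the per-level summary; an NA summary sorts last
lev_plain <- levels(reorder(rider, time, FUN = median))
lev_narm  <- levels(reorder(rider, time, FUN = \(x) median(x, na.rm = TRUE)))

print(lev_plain)  # "B" ends up last even though its observed lap is the fastest
print(lev_narm)   # "B" moves to the front once the NA is ignored
```

After coord_flip(), the last level is drawn in the top row, which is why a group with an NA median appears at the top of its facet.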
plot_dat %>%
  ggplot(aes(x = reorder_within(rider_name, by = rider_lap_time,
                                within = event_track,
                                # ignore missing lap times when ranking riders
                                fun = \(x) median(x, na.rm = TRUE)),
             y = rider_lap_time)) +
  geom_point(stat = "summary", fun.data = "median_mad") +
  geom_errorbar(stat = "summary", fun.data = "median_mad", width = 0.1) +
  geom_line(stat = "summary", fun.data = "median_mad", aes(group = 1)) +
  scale_x_reordered() +
  coord_flip() +
  facet_wrap(~ event_track, scales = "free") +
  labs(x = "Rider", y = "Lap Time (Minutes)") +
  theme_minimal()
#> Warning: Removed 1 row containing non-finite outside the scale range (`stat_summary()`).
#> Removed 1 row containing non-finite outside the scale range (`stat_summary()`).
#> Removed 1 row containing non-finite outside the scale range (`stat_summary()`).
Created on 2024-12-23 with reprex v2.0.2
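If you would rather not touch the summary function, another option is to find and drop the incomplete rows before plotting, which should also silence the "Removed 1 row" warnings. The sketch below uses base R on a small stand-in data frame; toy and its values are hypothetical, and only the column names mirror plot_dat.

```r
# Hypothetical stand-in for plot_dat with one missing lap time
toy <- data.frame(
  event_track    = c("Daytona", "Daytona", "Daytona", "Arlington"),
  rider_name     = c("Lawrence", "Lawrence", "Sexton", "Sexton"),
  rider_lap_time = c(1.50, NA, 1.57, 0.75)
)

# Show which track/rider rows are missing a lap time
missing_rows <- toy[is.na(toy$rider_lap_time), ]
print(missing_rows)

# Keep only complete lap times; a plain median is then safe for ordering
clean <- toy[!is.na(toy$rider_lap_time), ]
print(clean)
```

With the real data, the same filter would go in front of the ggplot() call, and fun = median could then stay as it was in the question.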