The head of my dataframe is as follows:
ranges n
1 (0,1e+04] 13
2 (1e+04,2e+04] 11
3 (2e+04,3e+04] 21
4 (3e+04,4e+04] 14
5 (4e+04,5e+04] 9
6 (5e+04,6e+04] 8
7 (6e+04,7e+04] 13
8 (7e+04,8e+04] 11
9 (8e+04,9e+04] 16
10 (9e+04,1e+05] 16
This shows ranges of specific data and the number of counts per each of the groups.
I'd like to create a new column between the two existing ones. It should consist of values representing the middle point of each group range( i.e., 5000 for the first group, 15,000 for the second and so on.
So far, I've managed to add a new column with the command:
add_column(Position = "Value",
.after="ranges")
The results are:
ranges Position n
1 (0,1e+04] Value 13
2 (1e+04,2e+04] Value 11
3 (2e+04,3e+04] Value 21
4 (3e+04,4e+04] Value 14
5 (4e+04,5e+04] Value 9
6 (5e+04,6e+04] Value 8
7 (6e+04,7e+04] Value 13
8 (7e+04,8e+04] Value 11
9 (8e+04,9e+04] Value 16
10 (9e+04,1e+05] Value 16
I'm still not sure how to add the middle value of each groups' range instead of just "Value" using this command.
Any suggestions on how to do this?
mydf <- data.frame(
ranges =
c("(0,1e+04]",
"(1e+04,2e+04]",
"(2e+04,3e+04]",
"(3e+04,4e+04]",
"(4e+04,5e+04]",
"(5e+04,6e+04]",
"(6e+04,7e+04]",
"(7e+04,8e+04]",
"(8e+04,9e+04]",
"(9e+04,1e+05]")
)
library(dplyr)
library(stringr)
library(tidyr)
library(readr)
mydf |>
separate(ranges, into = c("min", "max"), sep = ",") |>
mutate(min = str_remove(min, "\\("),
max = str_remove(max, "\\]"),
across(c(min, max), parse_number)) |>
rowwise() |>
mutate(value = mean(c_across(c(min, max)))) |>
ungroup()
# A tibble: 10 × 3
min max value
<dbl> <dbl> <dbl>
1 0 10000 5000
2 10000 20000 15000
3 20000 30000 25000
4 30000 40000 35000
5 40000 50000 45000
6 50000 60000 55000
7 60000 70000 65000
8 70000 80000 75000
9 80000 90000 85000
10 90000 100000 95000