I'm trying to build a histogram from my data. It looks like this: a data frame where each row holds a range of years (or a single year). I need a histogram of all the values covered by those ranges.
year <- c("1925:2002",
"2008",
"1925:2002",
"1925:2002",
"1925:2002",
"2008:2013",
"1934",
"1972:1988")
All I was able to figure out is to convert every string to a sequence with seq(), but it doesn't work properly:
for (i in 1:length(year)) {
  rr[i] <- seq(
    as.numeric(unlist(strsplit(year[i], ":"))[1]),
    as.numeric(unlist(strsplit(year[i], ":"))[2])
  )
}
Tick the answer box for @MrFlick. I had done this at the same time and the only difference is the piping:
library(magrittr)
strsplit(year, ":") %>%
lapply(as.integer) %>%
lapply(function(x) seq(x[1], x[length(x)])) %>%
unlist() %>%
hist()
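The same steps can also be written without pipes. This is only a sketch: `expand_years` and `all_years` are hypothetical names introduced here, and the input is a shortened copy of the `year` vector from the question. The key detail is that `p[length(p)]` equals `p[1]` when the string holds a single year, so `seq()` returns just that year instead of failing on a missing second element.

```r
# Hypothetical helper: expand "start:end" or "year" strings into integer years.
expand_years <- function(x) {
  parts <- strsplit(x, ":", fixed = TRUE)
  unlist(lapply(parts, function(p) {
    p <- as.integer(p)
    # For a single year, p[length(p)] is p[1], so seq() yields that one year.
    seq(p[1], p[length(p)])
  }))
}

# Shortened sample of the question's data (an assumption for illustration).
year <- c("1925:2002", "2008", "1972:1988")
all_years <- expand_years(year)
```

Passing `all_years` to `hist()` should then draw the histogram over every year covered by the ranges.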
Full-on tidyverse:
library(tidyverse)
str_split(year, ":") %>%
map(as.integer) %>%
map(~seq(.x[1], .x[length(.x)])) %>%
flatten_int() %>%
hist()
To back up my comments, in case any "tidyverse 4eva" folks join the fray:
library(tidyverse)
library(microbenchmark)
microbenchmark(
  base = as.integer(
    unlist(
      lapply(
        lapply(
          strsplit(year, ":"),
          as.integer
        ),
        function(x) seq(x[1], x[length(x)])
      ),
      use.names = FALSE
    )
  ),
  tidy = str_split(year, ":") %>%
    map(as.integer) %>%
    map(~seq(.x[1], .x[length(.x)])) %>%
    flatten_int()
)
## Unit: microseconds
##  expr     min      lq     mean   median       uq      max neval
##  base  89.099  96.699 132.1684 102.5895 110.7165 2895.428   100
##  tidy 631.817 647.812 672.5904 667.8250 686.2740  909.531   100