I would like to import an existing format catalog (only the numeric formats to start with) from SAS into R. In SAS, each format entry is defined by a start value and an end value. All integer values from start to end should be mapped to the target value (which can be a number or a character string).
To use this in R, I have to create a new variable that holds the integer sequence from the start value to the end value of each row.
An example dataset looks like this:
fname <- c("aformat", "aformat", "aformat", "aformat", "aformat")
svalue <- c(1, 10, 15, 16, 30)
evalue <- c(2, 13, 15, 16, 35)
gvalue <- c("Berne", "Lucerne", "Zurich", "Chur", "Basel")
df <- data.frame(fname, svalue, evalue, gvalue)
I tried to solve the problem with the tidyr functions "complete" and "fill", but this also fills the gaps between the ranges (for example the value 14, which lies between the ranges 10:13 and 15:15).
Failed attempt with complete and fill:
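library(dplyr)
library(tidyr)
# Fills every integer from min(svalue) to max(evalue), including the gaps
output <- df %>%
  group_by(fname) %>%
  complete(evalue = full_seq(min(svalue):max(evalue), 1)) %>%
  select(fname, evalue, gvalue) %>%
  fill(gvalue, .direction = "up")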
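You can use reframe() to "expand" the rows of a data frame. With .by = c(fname, gvalue), each range in the example forms its own group, so seq() receives a single start and end value.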
library(dplyr)
reframe(df, evalue = seq(svalue, evalue), .by = c(fname, gvalue))
     fname  gvalue evalue
1  aformat   Berne      1
2  aformat   Berne      2
3  aformat Lucerne     10
4  aformat Lucerne     11
5  aformat Lucerne     12
6  aformat Lucerne     13
7  aformat  Zurich     15
8  aformat    Chur     16
9  aformat   Basel     30
10 aformat   Basel     31
11 aformat   Basel     32
12 aformat   Basel     33
13 aformat   Basel     34
14 aformat   Basel     35
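A real format catalog may map several ranges to the same label. In that case a group would contain more than one row and seq() would most likely fail, because it expects a single start and end value. The following is a minimal sketch of a row-wise variant, not a definitive implementation. The data frame df2, its extra "Berne" row (40 to 41), the column name value, and the vector codes are assumptions made for illustration only.

```r
library(dplyr)
library(tidyr)

# Example data from the question plus one hypothetical row that
# reuses the label "Berne" for a second range (40 to 41).
df2 <- data.frame(
  fname  = "aformat",
  svalue = c(1, 10, 15, 16, 30, 40),
  evalue = c(2, 13, 15, 16, 35, 41),
  gvalue = c("Berne", "Lucerne", "Zurich", "Chur", "Basel", "Berne")
)

# Build one integer sequence per row (list column), then unnest it
# into one row per value. Gaps between ranges are never created.
lookup <- df2 %>%
  mutate(value = Map(seq, svalue, evalue)) %>%
  unnest(value) %>%
  select(fname, value, gvalue)

# Apply the format to raw codes; a code outside every range
# (such as 14) is returned as NA.
codes <- c(2, 14, 31, 40)
formatted <- lookup$gvalue[match(codes, lookup$value)]
formatted
```

Because the expansion works row by row, it does not depend on the labels being unique. The match() lookup assumes that the ranges within one format do not overlap; if they do, the first matching row wins.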