Search code examples
rreshape

Reshaping data frame in R: from wide to long, but the 'varying' columns have unequal length


My question is described in the code below. I have looked here and in other forums for similar problems, but haven't found a solution that quite matches what I'm asking here. If it can be solved relying only on basic R, that would be preferable, but using a package is fine too.

id1 <- c("A", "A", "A", "B", "B", "C", "C", "C")
id2 <- c(10, 20, 30, 10, 30, 10, 20, 30)
x.1 <- ceiling(runif(8)*80) + 20
y.1 <- ceiling(runif(8)*15) + 200
x.2 <- ceiling(runif(8)*90) + 20
y.2 <- ceiling(runif(8)*20) + 200
x.3 <- ceiling(runif(8)*80) + 40

# The data frame contains to kinds of data values, x and y, repeated by a suffix number. In my example both
# the id-part and the data-part are not structured in a completely uniform manner.
mywidedata <- data.frame(id1, id2, x.1, y.1, x.2, y.2, x.3)

# If I wanted to make the data frame even wider, this would work. It generates NAs for the missing combination (B,20).
reshape(mywidedata, idvar = "id1", timevar = "id2", direction = "wide")

# What I want is "long", and this fails.
reshape(mywidedata, varying = c(3:7), direction = "long")

# I could introduce the needed column. This works.
mywidecopy <- mywidedata
mywidecopy$y.3 <- NA
mylongdata <- reshape(mywidecopy, idvar=c(1,2), varying = c(3:8), direction = "long", sep = ".")
# (sep-argument not needed in this case - the function can figure out the system)
names(mylongdata)[(names(mylongdata)=="time")] <- "id3"

# I want to reach the same outcome without manual manipulation. Is it possible with the just the
# built-in 'reshape'?

# Trying 'melt'. Not what I want.
reshape::melt(mywidedata, id.vars = c(1,2))

Solution

  • You can use pivot_longer from tidyr :

    tidyr::pivot_longer(mywidedata, 
                        cols = -c(id1, id2), 
                        names_to = c('.value', 'id3'), 
                        names_sep = '\\.')
    
    # A tibble: 24 x 5
    #   id1     id2 id3       x     y
    #   <chr> <dbl> <chr> <dbl> <dbl>
    # 1 A        10 1        66   208
    # 2 A        10 2        95   220
    # 3 A        10 3        89    NA
    # 4 A        20 1        34   208
    # 5 A        20 2        81   219
    # 6 A        20 3        82    NA
    # 7 A        30 1        23   201
    # 8 A        30 2        80   204
    # 9 A        30 3        75    NA
    #10 B        10 1        52   210
    # … with 14 more rows