Search code examples
rtidyversetidyrreshape2

Reshape dataframe so that each entry of column in repeated over all other columns


I have a data.frame, dat, which looks like this

dat = data.frame(x = c(1, 1.1, 1.2, 1.3), y = c(2, 2.1, 2.2, 2.3), output = c(2, 10, 101, 100))
    x   y output
1 1.0 2.0      2
2 1.1 2.1     10
3 1.2 2.2    101
4 1.3 2.3    100

I want that each pair of elements of columns "x" and "output" is repeated over column "y".

I have tried using tidyr::spread, tidyr::gather, and reshape2::melt to no avail. This is because I am a beginner in using tidyr and reshape2 and other reshaping packages.

Currently, I have resorted to using a loop to extract each element pair from columns "x" and "output" and creating a new data.frame, final_df, that combines the resulting data.frames. I believe this is definitely not the most efficient way to do it and am confident that there is a one-liner function somewhere which can do this magic for me.

In the resulting data.frame, if I subset the data.frame using say,

dplyr::filter(final_df, x == 1, output == 2)

it should look like so:

data.frame(x = rep(1, dat$x[1], nrow(dat)), y = dat$y, output = rep(dat$output[1], nrow(dat)))
  x   y output
1 1 2.0      2
2 1 2.1      2
3 1 2.2      2
4 1 2.3      2

I will be happy with an answer using tidyverse. Thank you.


Solution

  • Here is one option

    library(dplyr)
    library(tidyr)
    dat %>% mutate(y1=paste(y,collapse = ',')) %>% separate_rows(y1)
    

    If there is no duplication in x and output i.e we can treat them as an ID column then we can use tidyr::complete

    dat %>% complete(nesting(x,output),y)