Search code examples
rdataframereplacediagonal

Remove a specific duplicate values and keep diagonal


I have a dataset like this enter image description here

I want to separate "money", "income" and "loan" on separates rows. I mean, I want to get this :

enter image description here

My first thought was : replicate the row and then remove extra value to get the desire result. I got this : enter image description here

But I'm stuck to remove value in red. How can we do that ?

This is my code :

mydata = data.frame(
  name = "Vince",
  date = "16/05/1977",
  money = 20,
  city = "NY",
  income = 100,
  country = "USA",
  car = "Porsche",
  loan = 250
)

duplicated_data = do.call("rbind", replicate(3, mydata, simplify = FALSE))

Some help would be appreciated


Solution

  • In base R, convert to matrix and then replace non diagonal values by NAs

    cols = c("money", "income", "loan")
    mat <- as.matrix(duplicated_data[cols])
    duplicated_data[cols] <- ifelse(row(mat) == col(mat), mat, NA) #Replace with NA
    

    or that could be even simpler if you're okay with having 0 instead of NAs:

    duplicated_data[cols] <- diag(diag(mat)) #Replace with 0
    

    output

       name       date money city income country     car loan
    1 Vince 16/05/1977    20   NY     NA     USA Porsche   NA
    2 Vince 16/05/1977    NA   NY    100     USA Porsche   NA
    3 Vince 16/05/1977    NA   NY     NA     USA Porsche  250