I have similar data to the following (but larger one):
example <- rbind(data.frame(species = "A", trait1 = "yes", trait2 = NA),
data.frame(species = "A", trait1 = NA, trait2 = "yes"),
data.frame(species = "B", trait1 = NA, trait2 = "no"),
data.frame(species = "B", trait1 = "yes", trait2 = NA),
data.frame(species = "B", trait1 = "no", trait2 = NA),
data.frame(species = "B", trait1 = "no", trait2 = NA),
data.frame(species = "C", trait1 = NA, trait2 = "no"),
data.frame(species = "C", trait1 = "no", trait2 = NA),
data.frame(species = "D", trait1 = "yes", trait2 = NA),
data.frame(species = "D", trait1 = NA, trait2 = "yes"),
data.frame(species = "E", trait1 = NA, trait2 = "no"),
data.frame(species = "E", trait1 = "no", trait2 = NA),
data.frame(species = "E", trait1 = "no", trait2 = NA))
Here, trait2 is a fixed value (1 value for 1 species), but trait1 is variable within species. For each trait value, data for species come from different rows. After data management, I want to keep the variability existed in trait1, and this seems to make the process a bit complex.
Finally, I want to transform this data frame in R to the following one:
ex.res <- rbind(data.frame(species = "A", trait1 = "yes", trait2 = "yes"),
data.frame(species = "B", trait1 = "yes", trait2 = "no"),
data.frame(species = "B", trait1 = "no", trait2 = "no"),
data.frame(species = "B", trait1 = "no", trait2 = "no"),
data.frame(species = "C", trait1 = "no", trait2 = "no"),
data.frame(species = "D", trait1 = "yes", trait2 = "yes"),
data.frame(species = "E", trait1 = "no", trait2 = "no"),
data.frame(species = "E", trait1 = "no", trait2 = "no"))
I have tried many things, including some basic data management tools in R, and also duplicated
, unique
, and match_df
functions, but they were not successful in doing this completely.
Maybe a combined version of such functions could have been worked, but I am not able to do this. Is there any simple way to do it?
An approach by using base R,
first_part <- example[!is.na(example$trait1),]
second_part <- example[!is.na(example$trait2),]
merge(first_part[,-3], second_part[,-2], by="species")
species trait1 trait2
1 A yes yes
2 B yes no
3 B no no
4 B no no
5 C no no
6 D yes yes
7 E no no
8 E no no