I tried merge
and a all series of for/if
loops of which the best I will report.
I read several posts but I could not find any that does quite match.
I have 2 databases, one of 360 rows and the other one of 60 rows. I would like to add some columns present in the smaller one to the bigger one by four condition repeating the same number by another condition so to have a 360 rows dataset.
familiarity pb_type sex trial lower upper fit
mate tet m 1 1.760949 3.780915 2.809002
familiar tet m 1 2.020926 3.986183 3.021357
unfamiliar tet m 1 2.570472 4.499613 3.530639
mate stack m 1 3.479230 5.441066 4.500652
familiar stack m 1 2.934518 4.89067 3.904378
"familiarty", "pb_type", "sex" and "trial" are my conditions to select the rows and creates uniques combinations.
I would like to add the other columns "lower", "upper", and "fit" to my bigger dataset. Each of this row has to be repeated 6 times following the condition "id" that in my bigger database has
I cannot use rep
or so because the order of the conditions in different in the 2 dataset (e.g. in the familiarity column "mate", does not comes first in both)
Here is what I tried: the big dataset is "raw data", the small is "simulation"
max_count <- length(raw_data[,1])
count = 1
raw_data$lower <- NA
raw_data$upper <- NA
raw_data$mean <- NA
for(i in 1:length(simulation[,1])){
if(count<=max_count)
{
j<-count
while(raw_data[j,3] == simulation[i,3] && raw_data[j,4] == simulation[i,4]&& raw_data[j,7] == simulation[i,2] && raw_data[j,8] == simulation[i,1]){
raw_data$lower[[j]] <- simulation$lower[[i]]
raw_data$upper[[j]] <- simulation$upper[[i]]
raw_data$mean[[j]] <- simulation$fit[[i]]
}
count <-count+1
}
}
Unfortunately it goes into a infinite loop always at the same point, I think because of the different order ot the conditions.
Unfortunately I am not good with the package dplyr
...that might be the solution.
I realize that the question is long and complicated, please help me in refine it!
thanks for any input all the best
If I'm understanding your question correctly, you want to join using combinations of the first four variables of the data table you've shown as the key? Please clarify if this is not the case, and it might help to see the other data table you are trying to merge.
That said, is this what you want?
library(dplyr)
left_join(raw_data, simulation, by = c("familiarity","pb_type","sex","trial"))
It may not be necessary to specify the join variables depending on what your other data table looks like, but it can't hurt.