I have two data frames of equal number of rows (639) but differing column lengths (DF1: 5, DF2: 2500), the rows in DF1 correspond to the rows in DF2.
Some rows in DF2 will be removed due to several NAs
, but I have no information on which ones are removed. cbind()
does not allow me to bind the DFs together due to the now differing row lengths. However, I also need the rows to correspond, so if row 47 is removed in DF2, it must also be removed in DF1 upon merge. My assumption is that there can be some workaround with row.names
but I am not sure how to execute it. Help would be appreciated. Examples of DFs below:
DF1:
pp trialNo item trialTarget trial
1 pp01 1 M012 script 1
2 pp01 2 BS016 script 2
3 pp01 3 M007 script 3
4 pp01 4 BS010 script 4
5 pp01 5 M006 script 5
6 pp01 6 BS018 script 6
DF2:
V1 V2 V3 V4 V5 V6
1: 764 764 763 763 762 763
2: 714 714 711 708 705 704
3: 872 871 869 867 867 871
4: 730 728 727 724 722 719
5: 789 786 788 790 792 790
6: 922 923 928 933 938 938
And assuming row 3 in DF2 is removed, I would expect this after binding:
pp trialNo item trialTarget trial V1 V2 V3 V4 V5 V6
1 pp01 1 M012 script 1 764 764 763 763 762 763
2 pp01 2 BS016 script 2 714 714 711 708 705 704
4 pp01 4 BS010 script 4 730 728 727 724 722 719
5 pp01 5 M006 script 5 789 786 788 790 792 790
6 pp01 6 BS018 script 6 922 923 928 933 938 938
Thanks in advance.
You could create a row index in each of the dataframe.
df1$row <- 1:nrow(df1)
df2$row <- 1:nrow(df2)
Then remove 3rd row in df2
.
df2 <- df2[-3, ]
You can then merge
by row
column both the dataframes.
merge(df1, df2, by = 'row')
# row pp trialNo item trialTarget trial V1 V2 V3 V4 V5 V6
#1 1 pp01 1 M012 script 1 764 764 763 763 762 763
#2 2 pp01 2 BS016 script 2 714 714 711 708 705 704
#3 4 pp01 4 BS010 script 4 730 728 727 724 722 719
#4 5 pp01 5 M006 script 5 789 786 788 790 792 790
#5 6 pp01 6 BS018 script 6 922 923 928 933 938 938
data
df1 <- structure(list(pp = c("pp01", "pp01", "pp01", "pp01", "pp01",
"pp01"), trialNo = 1:6, item = c("M012", "BS016", "M007", "BS010",
"M006", "BS018"), trialTarget = c("script", "script", "script",
"script", "script", "script"), trial = 1:6, row = 1:6), row.names = c(NA,
-6L), class = "data.frame")
df2 <- structure(list(V1 = c(764L, 714L, 872L, 730L, 789L, 922L), V2 = c(764L,
714L, 871L, 728L, 786L, 923L), V3 = c(763L, 711L, 869L, 727L,
788L, 928L), V4 = c(763L, 708L, 867L, 724L, 790L, 933L), V5 = c(762L,
705L, 867L, 722L, 792L, 938L), V6 = c(763L, 704L, 871L, 719L,
790L, 938L)), class = "data.frame", row.names = c(NA, -6L))