I've got two data frames that have the same number of rows (22) and different numbers of columns.
sm_10 (22 rows, 15 columns):
2 0.577967 0.023869 0.021571 0.481754 0.61584 0 0 0 0 0 0.024057 0.014209 1 0.085784
8 0.0775 0.274113 2.7e-05 0.01215 0.009345 0 0 0 0 0 0.004092 0.00784 0 0
..
nm_10 (22 rows, 8 columns):
11 0.926554 0.256966 0.859375 0 0.191011 0 0 0
2 0.858757 0.256966 0.21875 0 0.662921 0 0.845506 0.090909
..
The first column of the two data frames holds the same values (names of cases), just in a different order. I need to find the matching row names in nm_10 and sm_10 and, for each matching row, subtract every element of nm_10 from every element of sm_10. Example:
for '2' sm_nm_10:
2 (0.577967-0.858757=-0.28079) (0.577967-0.256966=) (0.577967-0.21875) ...(0.577967-0.090909=..)
(0.023869-0.858757=) (0.023869-0.256966=) (0.023869-0.21875) ...(0.023869-0.090909=..)
....
(0.085784-0.858757=) (0.085784-0.256966=) (0.085784-0.21875) ...(0.085784-0.090909=..)
and so on for all the data: check every row's first column, find the matching row, and do the operation. Is there any simple way to do it? I looked into sweep and apply but couldn't figure out how to use them; I keep getting errors referring to length etc. I decided to keep it simple, and here is what I have:
s = numeric()
for (i in 1:nrow(sm_10)) {
  for (jj in 1:nrow(nm_10)) {
    for (j in 2:ncol(nm_10)) {
      for (ii in 2:ncol(sm_10)) {
        sm_10[i,] %in% nm_10[jj,]
        s <- sm_10[,ii] - nm_10[,j]
      }
    }
  }
}
What is wrong here? Could anyone explain and suggest better?
UPDATE:
The end result I need is all 22 rows with the element subtractions, that is, 22 rows with 14*7 = 98 columns.
I think the best solution here is to replicate the LHS by a sufficient multiplier such that it will then possess the desired output width, and then simply subtract the RHS from it. This will naturally be a vectorized subtraction and will cycle the RHS a sufficient number of times to fully cover the widened LHS. We must just take care to ensure that the pairing of elements is correct, which requires two things: (1) reorder the rows of the RHS such that the key values align with the LHS, and (2) replicate the LHS using the each parameter of rep(), rather than the times parameter:
df1 <- as.data.frame(cbind(sample(1:22),matrix(1:(22*14),22)));
df2 <- as.data.frame(cbind(sample(1:22),matrix(1:(22*7),22)));
df1;
## V1 V2 V3 V4 V5 V6 V7 V8 V9 V10 V11 V12 V13 V14 V15
## 1 22 1 23 45 67 89 111 133 155 177 199 221 243 265 287
## 2 20 2 24 46 68 90 112 134 156 178 200 222 244 266 288
## 3 13 3 25 47 69 91 113 135 157 179 201 223 245 267 289
## 4 12 4 26 48 70 92 114 136 158 180 202 224 246 268 290
## 5 16 5 27 49 71 93 115 137 159 181 203 225 247 269 291
## 6 7 6 28 50 72 94 116 138 160 182 204 226 248 270 292
## 7 1 7 29 51 73 95 117 139 161 183 205 227 249 271 293
## 8 2 8 30 52 74 96 118 140 162 184 206 228 250 272 294
## 9 9 9 31 53 75 97 119 141 163 185 207 229 251 273 295
## 10 14 10 32 54 76 98 120 142 164 186 208 230 252 274 296
## 11 4 11 33 55 77 99 121 143 165 187 209 231 253 275 297
## 12 21 12 34 56 78 100 122 144 166 188 210 232 254 276 298
## 13 15 13 35 57 79 101 123 145 167 189 211 233 255 277 299
## 14 10 14 36 58 80 102 124 146 168 190 212 234 256 278 300
## 15 8 15 37 59 81 103 125 147 169 191 213 235 257 279 301
## 16 6 16 38 60 82 104 126 148 170 192 214 236 258 280 302
## 17 19 17 39 61 83 105 127 149 171 193 215 237 259 281 303
## 18 3 18 40 62 84 106 128 150 172 194 216 238 260 282 304
## 19 5 19 41 63 85 107 129 151 173 195 217 239 261 283 305
## 20 18 20 42 64 86 108 130 152 174 196 218 240 262 284 306
## 21 17 21 43 65 87 109 131 153 175 197 219 241 263 285 307
## 22 11 22 44 66 88 110 132 154 176 198 220 242 264 286 308
df2;
## V1 V2 V3 V4 V5 V6 V7 V8
## 1 6 1 23 45 67 89 111 133
## 2 17 2 24 46 68 90 112 134
## 3 12 3 25 47 69 91 113 135
## 4 20 4 26 48 70 92 114 136
## 5 13 5 27 49 71 93 115 137
## 6 10 6 28 50 72 94 116 138
## 7 16 7 29 51 73 95 117 139
## 8 15 8 30 52 74 96 118 140
## 9 21 9 31 53 75 97 119 141
## 10 22 10 32 54 76 98 120 142
## 11 1 11 33 55 77 99 121 143
## 12 18 12 34 56 78 100 122 144
## 13 9 13 35 57 79 101 123 145
## 14 4 14 36 58 80 102 124 146
## 15 11 15 37 59 81 103 125 147
## 16 19 16 38 60 82 104 126 148
## 17 8 17 39 61 83 105 127 149
## 18 5 18 40 62 84 106 128 150
## 19 3 19 41 63 85 107 129 151
## 20 7 20 42 64 86 108 130 152
## 21 2 21 43 65 87 109 131 153
## 22 14 22 44 66 88 110 132 154
cbind(df1[,1],as.data.frame(rep(df1[,-1],each=ncol(df2)-1))-as.matrix(df2[match(df1[,1],df2[,1]),-1]));
## df1[, 1] V2 V2.1 V2.2 V2.3 V2.4 V2.5 V2.6 V3 V3.1 V3.2 V3.3 V3.4 V3.5 V3.6 V4 V4.1 V4.2 V4.3 V4.4 V4.5 V4.6 V5 V5.1 V5.2 V5.3 V5.4 V5.5 V5.6 V6 V6.1 V6.2 V6.3 V6.4 V6.5 V6.6 V7 V7.1 V7.2 V7.3 V7.4 V7.5 V7.6 V8 V8.1 V8.2 V8.3 V8.4 V8.5 V8.6 V9 V9.1 V9.2 V9.3 V9.4 V9.5 V9.6 V10 V10.1 V10.2 V10.3 V10.4 V10.5 V10.6 V11 V11.1 V11.2 V11.3 V11.4 V11.5 V11.6 V12 V12.1 V12.2 V12.3 V12.4 V12.5 V12.6 V13 V13.1 V13.2 V13.3 V13.4 V13.5 V13.6 V14 V14.1 V14.2 V14.3 V14.4 V14.5 V14.6 V15 V15.1 V15.2 V15.3 V15.4 V15.5 V15.6
## 1 22 -9 -31 -53 -75 -97 -119 -141 13 -9 -31 -53 -75 -97 -119 35 13 -9 -31 -53 -75 -97 57 35 13 -9 -31 -53 -75 79 57 35 13 -9 -31 -53 101 79 57 35 13 -9 -31 123 101 79 57 35 13 -9 145 123 101 79 57 35 13 167 145 123 101 79 57 35 189 167 145 123 101 79 57 211 189 167 145 123 101 79 233 211 189 167 145 123 101 255 233 211 189 167 145 123 277 255 233 211 189 167 145
## 2 20 -2 -24 -46 -68 -90 -112 -134 20 -2 -24 -46 -68 -90 -112 42 20 -2 -24 -46 -68 -90 64 42 20 -2 -24 -46 -68 86 64 42 20 -2 -24 -46 108 86 64 42 20 -2 -24 130 108 86 64 42 20 -2 152 130 108 86 64 42 20 174 152 130 108 86 64 42 196 174 152 130 108 86 64 218 196 174 152 130 108 86 240 218 196 174 152 130 108 262 240 218 196 174 152 130 284 262 240 218 196 174 152
## 3 13 -2 -24 -46 -68 -90 -112 -134 20 -2 -24 -46 -68 -90 -112 42 20 -2 -24 -46 -68 -90 64 42 20 -2 -24 -46 -68 86 64 42 20 -2 -24 -46 108 86 64 42 20 -2 -24 130 108 86 64 42 20 -2 152 130 108 86 64 42 20 174 152 130 108 86 64 42 196 174 152 130 108 86 64 218 196 174 152 130 108 86 240 218 196 174 152 130 108 262 240 218 196 174 152 130 284 262 240 218 196 174 152
## 4 12 1 -21 -43 -65 -87 -109 -131 23 1 -21 -43 -65 -87 -109 45 23 1 -21 -43 -65 -87 67 45 23 1 -21 -43 -65 89 67 45 23 1 -21 -43 111 89 67 45 23 1 -21 133 111 89 67 45 23 1 155 133 111 89 67 45 23 177 155 133 111 89 67 45 199 177 155 133 111 89 67 221 199 177 155 133 111 89 243 221 199 177 155 133 111 265 243 221 199 177 155 133 287 265 243 221 199 177 155
## 5 16 -2 -24 -46 -68 -90 -112 -134 20 -2 -24 -46 -68 -90 -112 42 20 -2 -24 -46 -68 -90 64 42 20 -2 -24 -46 -68 86 64 42 20 -2 -24 -46 108 86 64 42 20 -2 -24 130 108 86 64 42 20 -2 152 130 108 86 64 42 20 174 152 130 108 86 64 42 196 174 152 130 108 86 64 218 196 174 152 130 108 86 240 218 196 174 152 130 108 262 240 218 196 174 152 130 284 262 240 218 196 174 152
## 6 7 -14 -36 -58 -80 -102 -124 -146 8 -14 -36 -58 -80 -102 -124 30 8 -14 -36 -58 -80 -102 52 30 8 -14 -36 -58 -80 74 52 30 8 -14 -36 -58 96 74 52 30 8 -14 -36 118 96 74 52 30 8 -14 140 118 96 74 52 30 8 162 140 118 96 74 52 30 184 162 140 118 96 74 52 206 184 162 140 118 96 74 228 206 184 162 140 118 96 250 228 206 184 162 140 118 272 250 228 206 184 162 140
## 7 1 -4 -26 -48 -70 -92 -114 -136 18 -4 -26 -48 -70 -92 -114 40 18 -4 -26 -48 -70 -92 62 40 18 -4 -26 -48 -70 84 62 40 18 -4 -26 -48 106 84 62 40 18 -4 -26 128 106 84 62 40 18 -4 150 128 106 84 62 40 18 172 150 128 106 84 62 40 194 172 150 128 106 84 62 216 194 172 150 128 106 84 238 216 194 172 150 128 106 260 238 216 194 172 150 128 282 260 238 216 194 172 150
## 8 2 -13 -35 -57 -79 -101 -123 -145 9 -13 -35 -57 -79 -101 -123 31 9 -13 -35 -57 -79 -101 53 31 9 -13 -35 -57 -79 75 53 31 9 -13 -35 -57 97 75 53 31 9 -13 -35 119 97 75 53 31 9 -13 141 119 97 75 53 31 9 163 141 119 97 75 53 31 185 163 141 119 97 75 53 207 185 163 141 119 97 75 229 207 185 163 141 119 97 251 229 207 185 163 141 119 273 251 229 207 185 163 141
## 9 9 -4 -26 -48 -70 -92 -114 -136 18 -4 -26 -48 -70 -92 -114 40 18 -4 -26 -48 -70 -92 62 40 18 -4 -26 -48 -70 84 62 40 18 -4 -26 -48 106 84 62 40 18 -4 -26 128 106 84 62 40 18 -4 150 128 106 84 62 40 18 172 150 128 106 84 62 40 194 172 150 128 106 84 62 216 194 172 150 128 106 84 238 216 194 172 150 128 106 260 238 216 194 172 150 128 282 260 238 216 194 172 150
## 10 14 -12 -34 -56 -78 -100 -122 -144 10 -12 -34 -56 -78 -100 -122 32 10 -12 -34 -56 -78 -100 54 32 10 -12 -34 -56 -78 76 54 32 10 -12 -34 -56 98 76 54 32 10 -12 -34 120 98 76 54 32 10 -12 142 120 98 76 54 32 10 164 142 120 98 76 54 32 186 164 142 120 98 76 54 208 186 164 142 120 98 76 230 208 186 164 142 120 98 252 230 208 186 164 142 120 274 252 230 208 186 164 142
## 11 4 -3 -25 -47 -69 -91 -113 -135 19 -3 -25 -47 -69 -91 -113 41 19 -3 -25 -47 -69 -91 63 41 19 -3 -25 -47 -69 85 63 41 19 -3 -25 -47 107 85 63 41 19 -3 -25 129 107 85 63 41 19 -3 151 129 107 85 63 41 19 173 151 129 107 85 63 41 195 173 151 129 107 85 63 217 195 173 151 129 107 85 239 217 195 173 151 129 107 261 239 217 195 173 151 129 283 261 239 217 195 173 151
## 12 21 3 -19 -41 -63 -85 -107 -129 25 3 -19 -41 -63 -85 -107 47 25 3 -19 -41 -63 -85 69 47 25 3 -19 -41 -63 91 69 47 25 3 -19 -41 113 91 69 47 25 3 -19 135 113 91 69 47 25 3 157 135 113 91 69 47 25 179 157 135 113 91 69 47 201 179 157 135 113 91 69 223 201 179 157 135 113 91 245 223 201 179 157 135 113 267 245 223 201 179 157 135 289 267 245 223 201 179 157
## 13 15 5 -17 -39 -61 -83 -105 -127 27 5 -17 -39 -61 -83 -105 49 27 5 -17 -39 -61 -83 71 49 27 5 -17 -39 -61 93 71 49 27 5 -17 -39 115 93 71 49 27 5 -17 137 115 93 71 49 27 5 159 137 115 93 71 49 27 181 159 137 115 93 71 49 203 181 159 137 115 93 71 225 203 181 159 137 115 93 247 225 203 181 159 137 115 269 247 225 203 181 159 137 291 269 247 225 203 181 159
## 14 10 8 -14 -36 -58 -80 -102 -124 30 8 -14 -36 -58 -80 -102 52 30 8 -14 -36 -58 -80 74 52 30 8 -14 -36 -58 96 74 52 30 8 -14 -36 118 96 74 52 30 8 -14 140 118 96 74 52 30 8 162 140 118 96 74 52 30 184 162 140 118 96 74 52 206 184 162 140 118 96 74 228 206 184 162 140 118 96 250 228 206 184 162 140 118 272 250 228 206 184 162 140 294 272 250 228 206 184 162
## 15 8 -2 -24 -46 -68 -90 -112 -134 20 -2 -24 -46 -68 -90 -112 42 20 -2 -24 -46 -68 -90 64 42 20 -2 -24 -46 -68 86 64 42 20 -2 -24 -46 108 86 64 42 20 -2 -24 130 108 86 64 42 20 -2 152 130 108 86 64 42 20 174 152 130 108 86 64 42 196 174 152 130 108 86 64 218 196 174 152 130 108 86 240 218 196 174 152 130 108 262 240 218 196 174 152 130 284 262 240 218 196 174 152
## 16 6 15 -7 -29 -51 -73 -95 -117 37 15 -7 -29 -51 -73 -95 59 37 15 -7 -29 -51 -73 81 59 37 15 -7 -29 -51 103 81 59 37 15 -7 -29 125 103 81 59 37 15 -7 147 125 103 81 59 37 15 169 147 125 103 81 59 37 191 169 147 125 103 81 59 213 191 169 147 125 103 81 235 213 191 169 147 125 103 257 235 213 191 169 147 125 279 257 235 213 191 169 147 301 279 257 235 213 191 169
## 17 19 1 -21 -43 -65 -87 -109 -131 23 1 -21 -43 -65 -87 -109 45 23 1 -21 -43 -65 -87 67 45 23 1 -21 -43 -65 89 67 45 23 1 -21 -43 111 89 67 45 23 1 -21 133 111 89 67 45 23 1 155 133 111 89 67 45 23 177 155 133 111 89 67 45 199 177 155 133 111 89 67 221 199 177 155 133 111 89 243 221 199 177 155 133 111 265 243 221 199 177 155 133 287 265 243 221 199 177 155
## 18 3 -1 -23 -45 -67 -89 -111 -133 21 -1 -23 -45 -67 -89 -111 43 21 -1 -23 -45 -67 -89 65 43 21 -1 -23 -45 -67 87 65 43 21 -1 -23 -45 109 87 65 43 21 -1 -23 131 109 87 65 43 21 -1 153 131 109 87 65 43 21 175 153 131 109 87 65 43 197 175 153 131 109 87 65 219 197 175 153 131 109 87 241 219 197 175 153 131 109 263 241 219 197 175 153 131 285 263 241 219 197 175 153
## 19 5 1 -21 -43 -65 -87 -109 -131 23 1 -21 -43 -65 -87 -109 45 23 1 -21 -43 -65 -87 67 45 23 1 -21 -43 -65 89 67 45 23 1 -21 -43 111 89 67 45 23 1 -21 133 111 89 67 45 23 1 155 133 111 89 67 45 23 177 155 133 111 89 67 45 199 177 155 133 111 89 67 221 199 177 155 133 111 89 243 221 199 177 155 133 111 265 243 221 199 177 155 133 287 265 243 221 199 177 155
## 20 18 8 -14 -36 -58 -80 -102 -124 30 8 -14 -36 -58 -80 -102 52 30 8 -14 -36 -58 -80 74 52 30 8 -14 -36 -58 96 74 52 30 8 -14 -36 118 96 74 52 30 8 -14 140 118 96 74 52 30 8 162 140 118 96 74 52 30 184 162 140 118 96 74 52 206 184 162 140 118 96 74 228 206 184 162 140 118 96 250 228 206 184 162 140 118 272 250 228 206 184 162 140 294 272 250 228 206 184 162
## 21 17 19 -3 -25 -47 -69 -91 -113 41 19 -3 -25 -47 -69 -91 63 41 19 -3 -25 -47 -69 85 63 41 19 -3 -25 -47 107 85 63 41 19 -3 -25 129 107 85 63 41 19 -3 151 129 107 85 63 41 19 173 151 129 107 85 63 41 195 173 151 129 107 85 63 217 195 173 151 129 107 85 239 217 195 173 151 129 107 261 239 217 195 173 151 129 283 261 239 217 195 173 151 305 283 261 239 217 195 173
## 22 11 7 -15 -37 -59 -81 -103 -125 29 7 -15 -37 -59 -81 -103 51 29 7 -15 -37 -59 -81 73 51 29 7 -15 -37 -59 95 73 51 29 7 -15 -37 117 95 73 51 29 7 -15 139 117 95 73 51 29 7 161 139 117 95 73 51 29 183 161 139 117 95 73 51 205 183 161 139 117 95 73 227 205 183 161 139 117 95 249 227 205 183 161 139 117 271 249 227 205 183 161 139 293 271 249 227 205 183 161
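To see why each rather than times gives the right pairing, here is a minimal sketch on a toy list standing in for the data columns. The names cols, a, and b are hypothetical and only for illustration.

```r
# Toy list standing in for the data columns of a data.frame (hypothetical names)
cols <- list(a = 1:2, b = 3:4)

by_each  <- rep(cols, each = 2)   # copies of the same column stay adjacent
by_times <- rep(cols, times = 2)  # the whole set of columns is repeated as a block

names(by_each)   # column order when using each
names(by_times)  # column order when using times
```

With each, every LHS column sits next to its own copies, so when the RHS columns are cycled across the widened LHS, each LHS column meets every RHS column exactly once. With times, an LHS column would only ever be paired with the RHS column that happens to fall in the same position of the cycle.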
For a demo that's easier to verify by eye, here I'll use three rows, five data columns on the LHS, and two data columns on the RHS:
df1 <- as.data.frame(cbind(sample(1:3),matrix(1:(3*5),3)));
df2 <- as.data.frame(cbind(sample(1:3),matrix(1:(3*2),3)));
df1;
## V1 V2 V3 V4 V5 V6
## 1 3 1 4 7 10 13
## 2 1 2 5 8 11 14
## 3 2 3 6 9 12 15
df2;
## V1 V2 V3
## 1 3 1 4
## 2 2 2 5
## 3 1 3 6
cbind(df1[,1],as.data.frame(rep(df1[,-1],each=ncol(df2)-1))-as.matrix(df2[match(df1[,1],df2[,1]),-1]));
## df1[, 1] V2 V2.1 V3 V3.1 V4 V4.1 V5 V5.1 V6 V6.1
## 1 3 0 -3 3 0 6 3 9 6 12 9
## 2 1 -1 -4 2 -1 5 2 8 5 11 8
## 3 2 1 -2 4 1 7 4 10 7 13 10
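If you'd rather not verify by eye, one possible check is to compare a single row against all pairwise differences computed independently with outer(). This is only a sketch: the data frames are rebuilt with the fixed key order shown above (instead of sample()) so it is reproducible, and the variable names wide, key, x, y, and expected are my own.

```r
# Rebuild the small demo with the key order shown above (no randomness)
df1 <- data.frame(V1 = c(3, 1, 2), V2 = 1:3, V3 = 4:6, V4 = 7:9, V5 = 10:12, V6 = 13:15)
df2 <- data.frame(V1 = c(3, 2, 1), V2 = 1:3, V3 = 4:6)

# Same subtraction as above, without the key column prepended
wide <- as.data.frame(rep(df1[, -1], each = ncol(df2) - 1)) -
  as.matrix(df2[match(df1[, 1], df2[, 1]), -1])

# Independent check for one key: every df1 value minus every df2 value
key <- 1
x <- unlist(df1[df1$V1 == key, -1])  # minuends from df1
y <- unlist(df2[df2$V1 == key, -1])  # subtrahends from df2

# outer() gives a length(x) by length(y) matrix; transposing before flattening
# yields the order x[1]-y[1], x[1]-y[2], x[2]-y[1], ...
expected <- as.vector(t(outer(x, y, "-")))

stopifnot(all(unlist(wide[df1$V1 == key, ]) == expected))
```

The stopifnot() call is silent when the widened subtraction and the outer() result agree for that key.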
Notes:
rep() must operate on df1[,-1]. The -1 column subscript excludes the key column, which is assumed to be the first column in the data.frame.
each must be the number of subtrahends for each minuend, which means it also must exclude the key by subtracting one from ncol(df2).
rep() operates component-wise on the underlying list, so it returns a plain list of columns rather than a data.frame. But this works out for our purposes, because we can coerce back to data.frame with a call to as.data.frame(), and it's as if each individual element was replicated horizontally within its row. We are then ready with the widened data.frame to serve as the LHS of the subtraction.
The RHS rows are aligned using match(df1[,1],df2[,1]). This basically says "for each key value in df1, in the order they occur in df1, return the row index in which that key value can be found in df2." The resulting index vector can then be used to row-index df2 to order it to align with df1. In the same index operation, we can exclude the key column of df2, fully preparing it for the cyclic subtraction, thus we have df2[match(df1[,1],df2[,1]),-1].
Subtracting a data.frame from a data.frame of a different width fails with an error (‘-’ only defined for equally-sized data frames). Thus I had to add an as.matrix() call on the RHS before subtracting. Another possible solution here could be to replicate the RHS to match the size of the LHS.
Finally, there is the cbind() call wrapping the subtraction, which prepends the key column from df1 (df1[,1]).
The column names of the result can be changed afterward with names()/setNames()/colnames()/dimnames().
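The alternative mentioned in the notes, replicating the RHS to match the size of the LHS, could look roughly like the sketch below. It assumes the key is the first column of both data frames and that every key in df1 occurs in df2. The names lhs, rhs, lhs_wide, rhs_wide, res, and the "_minus_" naming scheme are my own choices, not part of the solution above.

```r
# Small demo data with a fixed key order (no randomness)
df1 <- data.frame(V1 = c(3, 1, 2), V2 = 1:3, V3 = 4:6, V4 = 7:9, V5 = 10:12, V6 = 13:15)
df2 <- data.frame(V1 = c(3, 2, 1), V2 = 1:3, V3 = 4:6)

lhs <- as.matrix(df1[, -1])
rhs <- as.matrix(df2[match(df1[, 1], df2[, 1]), -1])  # rows aligned to df1's keys

# Widen both sides explicitly so the subtraction needs no recycling:
# each LHS column is repeated in place, the RHS column block is repeated as a whole
lhs_wide <- lhs[, rep(seq_len(ncol(lhs)), each  = ncol(rhs)), drop = FALSE]
rhs_wide <- rhs[, rep(seq_len(ncol(rhs)), times = ncol(lhs)), drop = FALSE]

res <- data.frame(key = df1[, 1], lhs_wide - rhs_wide)

# Descriptive column names: <df1 column>_minus_<df2 column>
names(res)[-1] <- sprintf("%s_minus_%s", colnames(lhs_wide), colnames(rhs_wide))
res
```

Because both operands have identical dimensions here, the pairing of columns is spelled out in the two index vectors rather than left to recycling, which may be easier to reason about at the cost of building a second wide matrix.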