I am trying to write a loop in R to perform some iteration on two datasets called datasetA and datasetB. datasetA has 600 entries and datasetB has 200,000 entries.
For each entry in datasetA, I want to perform the following:
If the value of V2 is equal in both datasets, then calculate the ppm:
(datasetA$V3 - datasetB$V3) / datasetA$V3 * 1000000
If |ppm| < 10, then write the ppm value into column V4 of datasetB and write the corresponding name from datasetA$V1 into column V1 of datasetB.
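To make the formula concrete, here is a minimal sketch of the ppm calculation as a function. The name ppm and its arguments reference and observed are hypothetical (they are not part of my real code); the numbers are taken from the sample tables below (Mark's value in datasetA against the row with V3 = 80 in datasetB).

```r
# Hypothetical helper: signed ppm deviation of an observed value
# (datasetB$V3) from a reference value (datasetA$V3).
ppm <- function(reference, observed) {
  (reference - observed) / reference * 1e6
}

# Mark's V3 in datasetA against the V2 = 2, V3 = 80 row in datasetB
mark_ppm <- ppm(79.9999, 80)
mark_ppm              # roughly -1.25
abs(mark_ppm) < 10    # TRUE, so this row should be labelled "Mark"

# The function is vectorised over `observed`, so one datasetA value
# can be checked against many datasetB values at once.
abs(ppm(79.9999, c(70, 80, 102))) < 10
```

Only the row with V3 = 80 passes the 10 ppm threshold in the last call.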
Say this is a sample of datasetA, which has 600 entries in full:
datasetA<- read.table(text='Alex 1 50.00042
John 1 60.000423
Janine 3 88.000123
Aline 3 117
Mark 2 79.9999')
and this is an example of datasetB, which has 200,000 entries in full:
datasetB<- read.table(text='NA 1 50.0001 NA
NA 1 50.00032 NA
NA 2 70 NA
NA 2 80 NA
NA 3 88.0004 NA
NA 3 100 NA
NA 3 101 NA
NA 2 102 NA')
The final table should look like this:
datasetC <- read.table(text='Alex 1 50.0001 6.459945
Alex 1 50.00032 2.059983
NA 2 70 NA
Mark 2 80 -1.25
Janine 3 88.0004 -3.14772
NA 3 100 NA
NA 3 101 NA
NA 2 102 NA')
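This is the loop I have so far:
data <- datasetB
for (i in seq_len(nrow(datasetA))) {      # every row of datasetA
  for (j in seq_len(nrow(datasetB))) {    # every row of datasetB
    # signed ppm deviation of datasetB row j from datasetA row i
    ppm <- (datasetA$V3[i] - datasetB$V3[j]) / datasetA$V3[i] * 1e6
    # && is the scalar "and"; both sides are single values here
    if (datasetA$V2[i] == datasetB$V2[j] && abs(ppm) < 10) {
      data[j, 1] <- as.character(datasetA[i, 1])  # name from datasetA
      data[j, 4] <- ppm
    }
  }
}
data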
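With 600 x 200,000 row pairs, the nested loop does about 120 million scalar comparisons, which is slow in R. One possible way around that, sketched here on the sample data only, is to keep the loop over datasetA but compare each of its rows against all of datasetB in a single vectorised step. The name annotated is hypothetical, and I am assuming that column V1 of datasetB is read in as an all-NA logical column, which is what read.table produces for the sample above.

```r
# Sample data copied from the post (the real tables are much larger).
datasetA <- read.table(text = 'Alex 1 50.00042
John 1 60.000423
Janine 3 88.000123
Aline 3 117
Mark 2 79.9999')
datasetB <- read.table(text = 'NA 1 50.0001 NA
NA 1 50.00032 NA
NA 2 70 NA
NA 2 80 NA
NA 3 88.0004 NA
NA 3 100 NA
NA 3 101 NA
NA 2 102 NA')

# `annotated` is a hypothetical name for the output table.
annotated <- datasetB
annotated$V1 <- as.character(annotated$V1)  # all-NA column is read as logical
annotated$V4 <- as.numeric(annotated$V4)

for (i in seq_len(nrow(datasetA))) {
  # ppm of every datasetB row against datasetA row i, in one vectorised step
  ppm <- (datasetA$V3[i] - datasetB$V3) / datasetA$V3[i] * 1e6
  # rows with the same V2 and an absolute deviation below 10 ppm
  hit <- datasetB$V2 == datasetA$V2[i] & abs(ppm) < 10
  annotated$V1[hit] <- as.character(datasetA$V1[i])
  annotated$V4[hit] <- ppm[hit]
}
annotated
```

As in the nested loop, if several datasetA rows fall within 10 ppm of the same datasetB row, the last matching datasetA row overwrites the earlier ones.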