I believe this one is a complex problem and I will try to make it easy to understand.
I have 3 dataframes such as:
NS_3<-as.data.frame(cbind(c("3","3","3","3","3"),c("341007","325001","324003","524302","346002")))
NS_4<-as.data.frame(cbind(c("4","4","4","4","4","4","4"),c("341007","270001","270001","521009","346001","524302","335104")))
NS_15<-as.data.frame(cbind(c("15","15","15","15","15"),c("301001","301001","316104","344003","291003")))
names(NS_3)<-c("NS", "Pred FAILCODE TEST")
names(NS_4)<-c("NS", "Pred FAILCODE TEST")
names(NS_15)<-c("NS", "Pred FAILCODE TEST")
What I would like to do is to:
1) Check if dataframes NS_4
and NS_15
contain the values of each row of NS_3$Pred FAILCODE TEST
.
2) If this value is present in a certain dataframe, then it should count and storage all the values of the Pred FAILCODE TEST
of this dataframe, except for the value found.
For instance:
For the first Pred FAILCODE TEST
value in NS_3
, check if 341007
is present in NS_4
and NS_15
.
Once this check is TRUE
in NS_4
, then it should count the frequency of all the NS_4$Pred FAILCODE TEST
values, except for the one in question (which is 341007
).
Hence, the result for the first loop should be
Results for the first loop 341007
For the second and third value of NS_3$Pred FAILCODE TEST
, as neither 325001
nor 324003
appear at any dataframe, they should not be considered.
For the fourth value 524302
, the result should be something like:
FAILCODES 524302
341007 1
270001 2
521009 1
346001 1
335104 1
Once the loop ends with NS_3$Pred FAILCODE TEST
values, then it should do the same thing to NS_4$Pred FAILCODE TEST
values, searching them in NS_3
and NS_15
. After done with NS_4, it should do the same with NS_15
, searching the
NS_15$Pred FAILCODE TEST
values at NS_3
and NS_4
.
I believe it will require nested for-loops to go through each row of each dataframe. Also, dflist<-list(df1=NS_3,df2=NS_4,df3=NS_15)
would be probably helpful in those loops.
Actually I have around 70 different dataframes and 50 different Pred FAILCODE TEST
values to check in each dataframe.
I hope it is clear, if you guys need more infos please let me know!
Think this does it,
#your code
NS_3<-as.data.frame(cbind(c("3","3","3","3","3"),c("341007","325001","324003","524302","346002")))
NS_4<-as.data.frame(cbind(c("4","4","4","4","4","4","4"),c("341007","270001","270001","521009","346001","524302","335104")))
NS_15<-as.data.frame(cbind(c("15","15","15","15","15"),c("301001","301001","316104","344003","291003")))
names(NS_3)<-c("NS", "Pred FAILCODE TEST")
names(NS_4)<-c("NS", "Pred FAILCODE TEST")
names(NS_15)<-c("NS", "Pred FAILCODE TEST")
#Make a vector of your Tables suffixes
df_index <- c(3,4,15)
#Essentially rbind() all of tables in your df_index
#there is probably an elegant way to do this with do.call()
input <- eval(parse(text = paste0("rbind(",
paste0("NS_", df_index, collapse = ","),
")")
)
)
require(dplyr)
require(magrittr)
#convert from factor to numeric
input$`Pred FAILCODE TEST` <- as.numeric(as.character(input$`Pred FAILCODE TEST`))
input$NS <- as.numeric(as.character(input$NS))
#make a compressed table of frequencies
input %>% group_by(NS, `Pred FAILCODE TEST`) %>%
summarize(n=n()) -> compressTBL
#little function to look up each record and compare
Lookup <- function(NS, FailCode){
input$NS[input$`Pred FAILCODE TEST` == FailCode & !input$NS == NS]
}
#the output, a list, each column is row in your input table
output <- sapply(X = 1:nrow(input),
FUN = function(x){
compressTBL[compressTBL$NS == Lookup(input$NS[x], input$`Pred FAILCODE TEST`[x]),]
})
#The only records with values are 1,4,6,11
output
#same as what you got in your loop
as.data.frame(output[,4]) #4th record 524302