I am new to R programming and a bit stumped on this question. I have 3 different databases and I want to extract 2 sets of test and train data from each of them. I know how to do this with individual lines of code, but that is a repetitive task, so I want to automate it using a function.
I want each output name to carry the database name, followed by test or train, followed by a number, and I want to export the results directly into the global environment. Here is the code that I wrote, using mtcars as a sample database; it does not work. I have also given my desired output below.
Just to be clear, this is only an example. There are other places where I want to customise the output of a function in the global environment without writing the code over and over again.
mydata<-mtcars
output_2_set_test_train <-function(x){
  library(caTools)
  i=1
  dfname <- deparse(substitute(x))
  while(i<=2){
    sample.split(x[[1]], SplitRatio = .75)->split_tag
    subset(x, split_tag==T)->> paste0(dfname,"_","train","_",i)
    subset(x, split_tag==F)->> paste0(dfname,"_","test","_",i)
    i=i+1
  }
}
output_2_set_test_train(mydata)
After running my function, I want my global environment to contain the following data frames, with these specific names:
#1 data frame named -> mydata_train_1
#2 data frame named -> mydata_test_1
#3 data frame named -> mydata_train_2
#4 data frame named -> mydata_test_2
I searched a lot for this but couldn't find any answers that work.
Can someone please help me correct the code?
Please see Konrad's warnings about this type of coding and also about the conventions in R. Almost nothing in your code is idiomatic.
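A common alternative to creating variables in the global environment is to have the function return all the splits in a list and index into that. The following is only a minimal sketch of that idea: `make_splits`, `n_sets` and `ratio` are names I made up for illustration, and it uses a plain base R random split with `sample()` rather than `caTools::sample.split()`, so the two are not equivalent.

```r
# Hypothetical helper (not from the question): returns the splits in a list
# instead of writing variables into the global environment.
make_splits <- function(x, n_sets = 2, ratio = 0.75) {
  lapply(seq_len(n_sets), function(i) {
    # Plain random split in base R; this is NOT caTools::sample.split()
    train_rows <- sample(nrow(x), size = floor(ratio * nrow(x)))
    in_train <- seq_len(nrow(x)) %in% train_rows
    list(
      train = x[in_train, , drop = FALSE],
      test  = x[!in_train, , drop = FALSE]
    )
  })
}

splits <- make_splits(mtcars)
train_1 <- splits[[1]]$train  # first training set
test_2  <- splits[[2]]$test   # second test set
```

With this shape, handling 3 databases is a matter of calling `make_splits()` on each one and keeping the results in named objects or a named list, with no generated variable names involved.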
To make your function work as written, you would need to replace these lines:
subset(x, split_tag==T)->> paste0(dfname,"_","train","_",i)
subset(x, split_tag==F)->> paste0(dfname,"_","test","_",i)
with something like:
assign(paste0(dfname,"_","train","_",i), subset(x, split_tag==T), envir = globalenv())
assign(paste0(dfname,"_","test","_",i), subset(x, split_tag==F), envir = globalenv())
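Put together, the whole function might look roughly like the sketch below. It assumes the caTools package is installed; the `n_sets` argument and the `for` loop are my own additions in place of the hard-coded `while` loop, and `library()` is moved outside the function. Everything else follows the code in the question.

```r
library(caTools)  # assumed to be installed; provides sample.split()

output_2_set_test_train <- function(x, n_sets = 2) {
  # Name of the object passed in, e.g. "mydata"
  dfname <- deparse(substitute(x))
  for (i in seq_len(n_sets)) {
    # Logical vector: TRUE marks rows for the training set
    split_tag <- sample.split(x[[1]], SplitRatio = 0.75)
    # assign() takes the variable name as a string, which ->> cannot do
    assign(paste0(dfname, "_train_", i), subset(x, split_tag), envir = globalenv())
    assign(paste0(dfname, "_test_", i), subset(x, !split_tag), envir = globalenv())
  }
  invisible(NULL)
}

mydata <- mtcars
output_2_set_test_train(mydata)
ls(pattern = "^mydata_")  # lists the data frames that were created
```

The original lines fail because the right-hand side of `->>` has to be a name, not a call such as `paste0(...)` that computes one; `assign()` exists for exactly the case where the name is built as a string.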