Customised output name in global environment using R functions


I am new to R programming and a bit stumped on this question. I have 3 different databases and I want to extract 2 sets of test and train data from each of them. I know how to do this with individual lines of code, but that is repetitive, so I want to automate it with a function.

I want each output name to consist of the database name, followed by "test" or "train", followed by a number, and I want the outputs created directly in the global environment. Here is the code I wrote, using mtcars as a sample database. It does not work. My desired output is given below the code.

To be clear, this is just an example. There are other places where I want to customise the names of the objects a function creates in the global environment, without writing the same code over and over again.

mydata<-mtcars
 
output_2_set_test_train <-function(x){

  library(caTools)
  i=1
  dfname <- deparse(substitute(x))

  while(i<=2){
    sample.split(x[[1]], SplitRatio = .75)->split_tag
    subset(x, split_tag==T)->> paste0(dfname,"_","train","_",i)
    subset(x, split_tag==F)->> paste0(dfname,"_","test","_",i)
    i=i+1
  }
}
 
output_2_set_test_train(mydata)

After running my function, I want my global environment to contain the following data frames with these specific names:

#1 data frame named -> mydata_train_1
#2 data frame named -> mydata_test_1
#3 data frame named -> mydata_train_2
#4 data frame named -> mydata_test_2

I searched a lot for this but couldn't find any answers that work.

Can someone please help me correct the code?


Solution

  • Please see Konrad's warnings about this type of coding and also about the conventions in R. Almost nothing in your code is idiomatic.

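    The `->>` operator needs a variable name as its target, not a string built at run time with paste0(), so you would need to replace these lines: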

    subset(x, split_tag==T)->> paste0(dfname,"_","train","_",i)
    subset(x, split_tag==F)->> paste0(dfname,"_","test","_",i)
    

    with something like:

    assign(paste0(dfname, "_train_", i), subset(x, split_tag == TRUE), envir = globalenv())
    assign(paste0(dfname, "_test_", i), subset(x, split_tag == FALSE), envir = globalenv())
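
Putting the pieces together, a complete version might look like the sketch below. It is not the original poster's exact function: the names `split_to_global` and `split_to_list` and the parameters `n_sets` and `ratio` are made up for illustration, and base R `sample()` stands in for `caTools::sample.split()` so the example runs without extra packages (it is a plain random split of the rows, not a split on the values of the first column). The second function shows a common alternative that avoids writing into the global environment by returning a named list instead.

```r
# Sketch: split a data frame into train/test sets `n_sets` times and create
# the pieces as named objects in the global environment.
# Assumption: base R sample() replaces caTools::sample.split() here.
split_to_global <- function(x, n_sets = 2, ratio = 0.75) {
  dfname <- deparse(substitute(x))  # name of the object passed in, e.g. "mydata"
  for (i in seq_len(n_sets)) {
    # Logical vector marking about `ratio` of the rows as training rows
    is_train <- seq_len(nrow(x)) %in% sample(nrow(x), round(ratio * nrow(x)))
    # assign() accepts the target name as a string, which `->>` does not
    assign(paste0(dfname, "_train_", i), x[is_train, , drop = FALSE], envir = globalenv())
    assign(paste0(dfname, "_test_", i), x[!is_train, , drop = FALSE], envir = globalenv())
  }
  invisible(NULL)
}

# Alternative without side effects: return all splits in one named list
split_to_list <- function(x, n_sets = 2, ratio = 0.75) {
  dfname <- deparse(substitute(x))
  out <- list()
  for (i in seq_len(n_sets)) {
    is_train <- seq_len(nrow(x)) %in% sample(nrow(x), round(ratio * nrow(x)))
    out[[paste0(dfname, "_train_", i)]] <- x[is_train, , drop = FALSE]
    out[[paste0(dfname, "_test_", i)]] <- x[!is_train, , drop = FALSE]
  }
  out
}

mydata <- mtcars
split_to_global(mydata)          # creates mydata_train_1, mydata_test_1, ...
splits <- split_to_list(mydata)  # same pieces, held in a list
names(splits)
```

With the list version, an individual split is reached as `splits$mydata_train_1` or `splits[["mydata_test_2"]]`, and the whole collection can be passed to `lapply()` without looking objects up by name in the global environment.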