
Passing rows of a data frame as a selection parameter to rxDataStep


I have a dataframe like this:

> DataSet_Fehler
    Ohne_Verschiebung    Mit_Verschiebung
1 2016-08-29 19:15:48 2016-08-29 19:19:34
2 2016-08-30 19:38:24 2016-08-30 19:42:18
3 2016-10-28 10:39:24 2016-10-28 10:42:48
4 2016-11-07 19:12:18 2016-11-07 19:15:45

I would like to filter my xdf file based on this data frame, keeping the rows whose Date falls inside one of these intervals. In SQL-like pseudocode:

SELECT *
   FROM Myxdf_file
   Where DataSet_Fehler[i,]$Ohne_Verschiebung < Date < DataSet_Fehler[i,]$Mit_Verschiebung

I think transformFunc could be the solution, but I am not sure, and I don't know how to implement it:

Filter_row<-function(DataSet_Fehler)
{
  return(DataSet_Fehler)
}
rxDataStep(inData = MyData,  transformFunc = Filter_row)

How can I do that?


Solution

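  • You can filter with the rowSelection argument of rxDataStep. Write the filter as a function, pass it together with your data frame through transformObjects, and call it in the rowSelection expression: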

    # dates on which to filter your data
    filterDf <- read.csv(text=
    "2016-08-29 19:15:48, 2016-08-29 19:19:34
    2016-08-30 19:38:24, 2016-08-30 19:42:18
    2016-10-28 10:39:24, 2016-10-28 10:42:48
    2016-11-07 19:12:18, 2016-11-07 19:15:45
    ", header=FALSE, colClasses="POSIXct")
    
    # sample data standing in for your xdf file
    indf <- read.csv(text="dt
    2016-08-29 19:16:00
    2016-08-29 19:20:00
    2016-08-30 19:40:00
    2016-09-01 12:00:00
    2016-11-07 19:14:00
    ", colClasses="POSIXct")
    inxdf <- rxDataStep(indf, "inxdf.xdf")
    
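    # returns TRUE for each element of x that lies strictly inside
    # at least one (start, end) interval from filterDf
    rowFilter <- function(x, filterDf)
    {
        start <- filterDf[[1]]   # first column: interval starts
        end <- filterDf[[2]]     # second column: interval ends
        vapply(x, function(d) any(start < d & d < end), FUN.VALUE=logical(1))
    }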
    
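    # fil and filDf are the names under which the function and the data frame
    # are made available inside the rowSelection expression
    rxDataStep(inxdf,
               rowSelection=fil(dt, filDf),
               transformObjects=list(fil=rowFilter, filDf=filterDf))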
    #                   dt
    #1 2016-08-29 19:16:00
    #2 2016-08-30 19:40:00
    #3 2016-11-07 19:14:00
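
If you want to check the interval logic before running it against the xdf file, the filter function can be exercised on ordinary vectors. The following is a minimal sketch in base R only (no RevoScaleR), using the same sample values as above. Parsing the timestamps with tz = "UTC" and the vectorised cross-check with outer() are assumptions added for illustration, not part of the original answer.

```r
# Base R only: exercise the filter logic outside of rxDataStep.
# tz = "UTC" is an assumption, used so the comparison does not depend on the local time zone.
starts <- as.POSIXct(c("2016-08-29 19:15:48", "2016-08-30 19:38:24",
                       "2016-10-28 10:39:24", "2016-11-07 19:12:18"), tz = "UTC")
ends   <- as.POSIXct(c("2016-08-29 19:19:34", "2016-08-30 19:42:18",
                       "2016-10-28 10:42:48", "2016-11-07 19:15:45"), tz = "UTC")
filterDf <- data.frame(start = starts, end = ends)

# the values that would come from the dt column of the xdf file
dt <- as.POSIXct(c("2016-08-29 19:16:00", "2016-08-29 19:20:00",
                   "2016-08-30 19:40:00", "2016-09-01 12:00:00",
                   "2016-11-07 19:14:00"), tz = "UTC")

# same logic as the function passed via transformObjects
rowFilter <- function(x, filterDf)
{
    start <- filterDf[[1]]
    end <- filterDf[[2]]
    vapply(x, function(d) any(start < d & d < end), FUN.VALUE = logical(1))
}

keep <- rowFilter(dt, filterDf)

# vectorised cross-check: compare every timestamp with every interval at once
afterStart <- outer(as.numeric(dt), as.numeric(starts), ">")
beforeEnd  <- outer(as.numeric(dt), as.numeric(ends), "<")
inAny <- rowSums(afterStart & beforeEnd) > 0

print(dt[keep])
```

The selected timestamps should match the three rows that rxDataStep returns above. The outer() variant builds a rows-by-intervals matrix, so it is only practical when the number of intervals is small relative to the chunk size.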