How can you apply a function or logical test to a ffdf?


This is essentially an earlier question, "R - applying ifelse to a whole data frame", asked for an ffdf instead of a data frame.

Given an ffdf, how can I apply an ifelse (or any element-wise replacement) to the entire object? The first two statements below work on test (an ordinary data frame), and I want the same result for test.ffdf. Searching suggests using physical() to get at the columns underlying an ffdf (e.g. "How to use apply or sapply or lapply with ffdf?"). That lets me see the data, but it does not seem to return a vector I can act on.

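library(ff)

# ff has no character vmode, so the string columns must be factors
# (data.frame() no longer creates factors by default since R 4.0)
test <- data.frame(year=c("1990","1991","","1993"), value=c(50,25,20,5), type=c('puppies', '', 'hello', 'party'), stringsAsFactors=TRUE)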

test.ffdf = as.ffdf(test)

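# Both of these work on the ordinary data frame
lapply(test, function(x) type.convert(as.character(x), na.strings = ""))
test[test==''] = NA

# Attempts on the ffdf: physical() returns a list of ff vectors, not a data frame
lapply(physical(test.ffdf), function(x) type.convert(as.character(x), na.strings = ""))
physical(test.ffdf)[physical(test.ffdf)=='']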

Similarly, I can perform a logical test like

test.ffdf$value > 20

but I can't find a way to apply it to the whole ffdf, as you could with a data frame.


Solution

    test.ffdf[,1:3][test.ffdf[,1:3]==''] <- NA

    physical(test.ffdf)
    #$year
    #ff (open) integer length=4 (4) levels:  1990 1991 1993
    # [1]  [2]  [3]  [4]
    #1990 1991 NA   1993

    #$value
    #ff (open) double length=4 (4)
    #[1] [2] [3] [4]
    #50  25  20   5

    #$type
    #ff (open) integer length=4 (4) levels:  hello party puppies
    #  [1]     [2]     [3]     [4]
    #  puppies NA      hello   party
    
    library(ffbase)
    test.ffdf <- droplevels(test.ffdf)
    
    str(test.ffdf[,names(test.ffdf)])
    # 'data.frame': 4 obs. of  3 variables:
    #  $ year : Factor w/ 3 levels "1990","1991",..: 1 2 NA 3
    #  $ value: num  50 25 20 5
    #  $ type : Factor w/ 3 levels "hello","party",..: 3 NA 1 2
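
Note that test.ffdf[,1:3] pulls the selected columns into RAM as an ordinary data frame, which is fine for a small example but may not be for a large ffdf. A minimal sketch of the same replacement done chunk by chunk follows. It assumes that chunk() from ff returns row ranges that can be used for both reading from and assigning to the ffdf, and the names rows, block and n_big are made up for illustration. The same loop also shows one way to run a logical test over the whole ffdf without loading it at once.

```r
library(ff)

# Same toy data as in the question; string columns as factors because
# ff cannot store character vectors.
test <- data.frame(year  = c("1990", "1991", "", "1993"),
                   value = c(50, 25, 20, 5),
                   type  = c("puppies", "", "hello", "party"),
                   stringsAsFactors = TRUE)
test.ffdf <- as.ffdf(test)

n_big <- 0  # running count of rows with value > 20 (hypothetical name)

# chunk() splits the rows into ranges small enough to hold in RAM
for (rows in chunk(test.ffdf)) {
  block <- test.ffdf[rows, ]            # this chunk as a normal data.frame
  n_big <- n_big + sum(block$value > 20)
  block[block == ""] <- NA              # same idiom as on a data.frame
  test.ffdf[rows, ] <- block            # write the chunk back to the ffdf
}

out <- test.ffdf[, ]                    # small example, so safe to load fully
```

As in the answer above, the empty string is still listed as a factor level afterwards, so droplevels() from ffbase is needed if that level should go away.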