group-by distinct-values revolution-r

Count distinct in a rxSummary


I want to count the distinct values of var2, grouped by var1, in an .xdf file.

I tried something like this:

    myFun <- function(dataList) {
        UniqueLevel <<- unique(c(UniqueLevel, dataList$var2))
        SumUniqueLevel <<- length(UniqueLevel)
        return(NULL)
    }

    rxSummary(formula = ~ var1,
              data = "DefModelo2.xdf",
              transformFunc = myFun,
              transformObjects = list(UniqueLevel = NULL),
              removeZeroCounts = FALSE)

Thank you in advance

EDIT:

Using RevoPemaR is probably the faster way.


Solution

  • Another option is to use rxCrossTabs. This gives you a cross-tabulation of the two factors, and you can count the nonzero entries in each row to get the number of distinct values of one factor within each level of the other.

    censusWorkers <- file.path(rxGetOption("sampleDataDir"), "CensusWorkers.xdf")
    censusXtabAge <- rxCrossTabs(~ F(age):F(wkswork1), data = censusWorkers, 
                                 removeZeroCounts = FALSE, returnXtabs = TRUE)
    # For each age (rows), count the wkswork1 levels (columns) with a nonzero count
    apply(censusXtabAge != 0, MARGIN = 1, sum)
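
The same counting trick can be tried on a small in-memory data frame with base R's `table`, which may help when checking the logic before running it on the .xdf file. This is only a sketch: the data frame `df` and its values are made up for illustration, with `var1` and `var2` named after the columns in the question.

```r
# Toy data standing in for the .xdf file (hypothetical values).
df <- data.frame(
  var1 = c("a", "a", "a", "b", "b", "c"),
  var2 = c(1, 1, 2, 3, 3, 3)
)

# Cross-tabulate var1 (rows) by var2 (columns), then count the
# nonzero cells in each row: one nonzero cell per distinct var2 value.
xtab <- table(df$var1, df$var2)
distinct_by_group <- apply(xtab != 0, MARGIN = 1, sum)
print(distinct_by_group)

# Cross-check against a direct per-group count of unique values.
direct <- tapply(df$var2, df$var1, function(x) length(unique(x)))
stopifnot(all(distinct_by_group == direct[names(distinct_by_group)]))
```

Here group a has two distinct values of var2, while b and c have one each. Note that the cross-tabulation has one cell per combination of levels, so this approach is best suited to factors with a moderate number of levels.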