
Split characters in column names to new columns with logical values in rows


I am trying to split column names into separate columns, where each new column keeps the values of the original one. The difficulty is that my rows hold logical values. Several posts show how to split columns whose rows contain strings, but I could not find one that deals with logical values.

My data.frame looks something like this:

mydf <- data.frame(author = c("N1", "N2", "N3"),
                   Aa..Ab = c(TRUE, TRUE, FALSE),
                   BB = c(TRUE, FALSE, TRUE),
                   Ca...Cb = c(FALSE, FALSE, TRUE))

The result should look something like this:

mydfnew <- data.frame(author = c("N1", "N2", "N3"),
                      Aa = c(TRUE, TRUE, FALSE),
                      Ab = c(TRUE, TRUE, FALSE),
                      BB = c(TRUE, FALSE, TRUE),
                      Ca = c(FALSE, FALSE, TRUE),
                      Cb = c(FALSE, FALSE, TRUE))

I tried to adapt code from another question ("Split character in column and name") that splits both the columns and their names, as follows:

splitCol <- function(dataframe, splitVars=names(dataframe)){
  split.DF <- dataframe[,splitVars]
  keep.DF <- dataframe[, !names(dataframe) %in% c(splitVars)]

  X <- function(x)matrix(unlist(rep(x)), byrow=TRUE)

  newdf <- as.data.frame(do.call(cbind, suppressWarnings(lapply(split.DF, X))) )
  names(newdf) <- paste(rep(names(split.DF), each=2), c(".a", ".b"), sep="") 
  data.frame(keep.DF,newdf)
}

When calling

splitCol(mydf) 

I get the error:

Error in names(newdf) <- paste(rep(names(split.DF), each = 2), c(".a", : 'names' attribute [8] must be the same length as the vector [4]


Solution

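Before the fix, it may help to see where the error comes from. The sketch below reuses the helper `X` from `splitCol()` in the question; `n_cols` and `n_names` are names introduced here for illustration only. It suggests that `X` merely reshapes each column into a one-column matrix and never splits anything, so only 4 columns are produced while `paste()` generates 8 names.

```r
# Sample data from the question
mydf <- data.frame(author = c("N1", "N2", "N3"),
                   Aa..Ab = c(TRUE, TRUE, FALSE),
                   BB = c(TRUE, FALSE, TRUE),
                   Ca...Cb = c(FALSE, FALSE, TRUE))

# Helper copied from splitCol(): turns a column into a one-column matrix
X <- function(x) matrix(unlist(rep(x)), byrow = TRUE)

# Number of columns actually produced by cbind-ing the helper's output
n_cols <- ncol(do.call(cbind, lapply(mydf, X)))

# Number of names splitCol() then tries to assign (two per original column)
n_names <- length(paste(rep(names(mydf), each = 2), c(".a", ".b"), sep = ""))

c(columns = n_cols, names = n_names)
## columns   names
##       4       8
```

The mismatch between 4 and 8 is what the `'names' attribute [8] must be the same length as the vector [4]` message reports.
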
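  • Here is an approach using replicate and Map: strsplit breaks each column name on runs of dots, and replicate copies the column once for each resulting name.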

    as.data.frame(Map(x = strsplit(names(mydf), '[.]+'), 
                      DATA = mydf, 
                      f = function(x,DATA){
                        setNames(replicate(length(x), DATA, simplify = FALSE),x  )}
                 ))
    ##    author    Aa    Ab    BB    Ca    Cb
    ##  1     N1  TRUE  TRUE  TRUE FALSE FALSE
    ##  2     N2  TRUE  TRUE FALSE FALSE FALSE
    ##  3     N3 FALSE FALSE  TRUE  TRUE  TRUE
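
A possible alternative, offered as a minimal sketch rather than as part of the original answer, is to repeat column positions instead of replicating the data inside `Map`. The names `parts` and `idx` are introduced here for illustration. It assumes, as the answer above does, that every run of dots in a column name is a separator, so a name that legitimately contains a dot would also be split.

```r
# Sample data from the question
mydf <- data.frame(author = c("N1", "N2", "N3"),
                   Aa..Ab = c(TRUE, TRUE, FALSE),
                   BB = c(TRUE, FALSE, TRUE),
                   Ca...Cb = c(FALSE, FALSE, TRUE))

# Split each column name on runs of one or more dots
parts <- strsplit(names(mydf), "[.]+")

# Repeat each column position once per name piece: 1 2 2 3 4 4
idx <- rep(seq_along(mydf), lengths(parts))

# Select the (repeated) columns and assign the split names
mydfnew <- setNames(mydf[idx], unlist(parts))
mydfnew
```

Indexing with repeated positions copies each column as often as its name has pieces, so the column types (logical here) are left unchanged.
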