
How to coerce all columns of multiple data.frames to character in R?


I want to coerce all columns of several data.frames to character so that I can rbind them later. The problem is that I can't write an appropriate function to use with lapply.

# Fake dataset: three 10 x 10 data.frames; note that 'NA' here is the string "NA"
set.seed(123)
A = as.data.frame(matrix(sample(c('NA',1:10),10*10,replace=TRUE),10))
B = as.data.frame(matrix(sample(c('NA',LETTERS[1:10]),10*10,replace=TRUE),10))
C = as.data.frame(matrix(sample(c('NA',letters[1:10]),10*10,replace=TRUE),10))

In principle, this task should be as simple as:

target = list(A, B, C)
lapply(target, function(x) {
  x <- as.character(x)
}) -> df

But, when I run str(df), I get this:

List of 3
 $ : chr [1:10] "c(\"2\", \"2\", \"9\", \"1\", \"5\", \"10\", \"4\", \"3\", \"5\", \"8\")" "c(\"9\", \"10\", \"4\", \"2\", \"10\", \"8\", \"8\", \"8\", \"2\", \"7\")" "c(\"9\", \"6\", \"9\", \"8\", \"2\", \"3\", \"NA\", \"10\", \"6\", \"4\")" "c(\"9\", \"6\", \"8\", \"8\", \"9\", \"6\", \"10\", \"4\", \"6\", \"4\")" ...
 $ : chr [1:10] "c(\"G\", \"B\", \"G\", \"NA\", \"F\", \"J\", \"F\", \"F\", \"I\", \"E\")" "c(\"F\", \"J\", \"I\", \"D\", \"E\", \"G\", \"D\", \"F\", \"J\", \"C\")" "c(\"B\", \"H\", \"F\", \"E\", \"I\", \"H\", \"F\", \"A\", \"B\", \"G\")" "c(\"C\", \"F\", \"C\", \"NA\", \"G\", \"C\", \"H\", \"G\", \"E\", \"J\")" ...
 $ : chr [1:10] "c(\"h\", \"h\", \"d\", \"f\", \"e\", \"j\", \"NA\", \"i\", \"i\", \"NA\")" "c(\"i\", \"

My next attempt was:

lapply(target, function(x, i) {
  x[, i] <- as.character(x[, i])
  return(x)
}) -> df

This returns 3 data.frames as expected, but the structure is still not what I want (partial str output for the first data.frame):

 $ :'data.frame':       10 obs. of  10 variables:
  ..$ V1 : chr [1:10] "c(\"2\", \"2\", \"9\", \"1\", \"5\", \"10\", \"4\", \"3\", \"5\", \"8\")" "c(\"9\", \"10\", \"4\", \"2\", \"10\", \"8\", \"8\", \"8\", \"2\", \"7\")" "c(\"9\", \"6\", \"9\", \"8\", \"2\", \"3\", \"NA\", \"10\", \"6\", \"4\")" "c(\"9\", \"6\", \"8\", \"8\", \"9\", \"6\", \"10\", \"4\", \"6\", \"4\")" ...
  ..$ V2 : chr [1:10] "c(\"2\", \"2\", \"9\", \"1\", \"5\", \"10\", \"4\", \"3\", \"5\", \"8\")" "c(\"9\", \"10\", \"4\", \"2\", \"10\", \"8\", \"8\", \"8\", \"2\", \"7\")" "c(\"9\", \"6\", \"9\", \"8\", \"2\", \"3\", \"NA\", \"10\", \"6\", \"4\")" "c(\"9\", \"6\", \"8\", \"8\", \"9\", \"6\", \"10\", \"4\", \"6\", \"4\")" ...
  ..$ V3 : chr [1:10] "c(\"2\", \"2\", \"9\", \"1\", \"5\", \"10\", \"4\", \"3\", \"5\", \"8\")" "c(\"9\", \"10\", \"4\", \"2\", \"10\", \"8\", \"8\", \"8\", \"2\", \"7\")" "c(\"9\", \"6\", \"9\", \"8\", \"2\", \"3\", \"NA\", \"10\", \"6\", \"4\")" "c(\"9\", \"6\", \"8\", \"8\", \"9\", \"6\", \"10\", \"4\", \"6\", \"4\")" ...
  ..$ V4 : chr [1:10] "c(\"2\", \"2\", \"9\", \"1\", \"5\", \"10\", \"4\", \"3\", \"5\

I'm stuck and don't know what else to try, so any advice would be much appreciated.


Solution

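  • as.character() applied to a whole data.frame treats it as a list and deparses each column into a single string, which is why the output contains strings such as "c(\"2\", \"2\", ...)". Instead, coerce the columns of each data.frame to character one by one; assigning into x[] keeps the data.frame structure.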

    lapply(target, function(x) {
      x[] <- lapply(x, as.character)
      x
    }) -> target
    
    df <- do.call(rbind, target)
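
To check the approach on something small, here is a minimal sketch. The data frames df_num and df_char and the helper to_character_df are hypothetical names made up for this illustration and are not part of the question. The sketch first shows why calling as.character() on a whole data.frame goes wrong, then applies the column-by-column conversion and binds the results.

```r
# Hypothetical toy data with different column types (not the question's data)
df_num  <- data.frame(id = 1:3, value = c(1.5, NA, 3))
df_char <- data.frame(id = c("a", "b", "c"), value = c("x", "y", NA),
                      stringsAsFactors = FALSE)

# Why the first attempt fails: as.character() sees the data.frame as a list
# and returns one deparsed string per column instead of converting the cells.
whole <- as.character(df_num)
length(whole)  # one element per column, not per cell

# Hypothetical helper: convert every column to character.
to_character_df <- function(x) {
  # x[] <- ... replaces the columns in place, so x remains a data.frame
  x[] <- lapply(x, as.character)
  x
}

# Convert each data.frame in the list, then stack them row-wise.
converted <- lapply(list(df_num, df_char), to_character_df)
combined  <- do.call(rbind, converted)

str(combined)
```

Note that rbind matches data.frame columns by name, so this assumes every data.frame in the list has the same column names, as A, B and C in the question do (V1 to V10).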