r dataframe read.csv column-types

Change column types based on a substring in the column names


I have a very large data frame of sales data (df8). When I loaded it, some of the variables that I want to be numeric were read in as chr. I want to convert every column whose name contains the word "Order" from chr to numeric. How can I do this?


Solution

  • I would use grepl to find the column names that contain "order", then convert each matching column to numeric. Note that the columns here are character; this will not work as-is if your data is a factor, because as.numeric on a factor returns the internal integer codes. In that case use as.numeric(as.character(x)).

    # create data.frame with characters
    xy <- data.frame(a = runif(5), b.order = runif(5), cOrder = runif(5))
    xy[, c(2, 3)] <- sapply(xy[, c(2, 3)], FUN = as.character)
    str(xy)
    'data.frame':   5 obs. of  3 variables:
     $ a      : num  0.914 0.468 0.106 0.624 0.841
     $ b.order: chr  "0.363523897947744" "0.56488766730763" "0.42081760126166" "0.560672372812405" ...
     $ cOrder : chr  "0.949268750846386" "0.596737345447764" "0.368769273394719" "0.717566329054534" ...
    
    with.order <- grepl("order", names(xy), ignore.case = TRUE)
    
    xy[, with.order] <- sapply(xy[, with.order], FUN = as.numeric)
    str(xy)
    'data.frame':   5 obs. of  3 variables:
     $ a      : num  0.914 0.468 0.106 0.624 0.841
     $ b.order: num  0.364 0.565 0.421 0.561 0.768
     $ cOrder : num  0.949 0.597 0.369 0.718 0.417
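
The factor caveat mentioned above can be folded into the same grepl approach. The following is a minimal sketch, not part of the original answer: the helper name convert_matching_cols and the example data are hypothetical. It uses lapply with list-style indexing (df[hits]) instead of sapply, an assumption-free swap that keeps each column separate even when only one column matches.

```r
# Hypothetical helper (not from the original answer): convert every column
# whose name matches `pattern` to numeric, for character and factor columns.
convert_matching_cols <- function(df, pattern = "order") {
  # Logical vector: TRUE for column names containing the pattern
  hits <- grepl(pattern, names(df), ignore.case = TRUE)

  # df[hits] stays a data.frame; lapply() returns one list element per column
  df[hits] <- lapply(df[hits], function(x) {
    # For factors, go through the labels rather than the integer codes
    if (is.factor(x)) x <- as.character(x)
    as.numeric(x)
  })
  df
}

# Made-up example data: one character column and one factor column
xy <- data.frame(
  a       = c(1.5, 2.5, 3.5),
  b.order = c("0.25", "0.5", "0.75"),
  cOrder  = factor(c("10", "20", "30")),
  stringsAsFactors = FALSE
)

xy <- convert_matching_cols(xy)
str(xy)
```

Values that cannot be parsed as numbers become NA, and as.numeric emits a coercion warning, so it is worth checking for unexpected NAs after converting real sales data.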