
Select numeric columns and one column specified by name from a data frame


I have a data frame which contains both numeric and non-numeric columns, say

df <- data.frame(v1=1:20,v2=1:20,v3=1:20,v4=letters[1:20],v5=letters[1:20])

To select only the non-numeric columns I would use

fixCol <- !sapply(df,is.numeric)

But now I also want to include a specific numeric column, say v2. My data frame is very big and the order of its columns changes, so I cannot index the column by position; I need to refer to it by the name 'v2'. I tried

fixCol$v2 = TRUE

but that gives me the warning "In fixCol$v2 = TRUE : Coercing LHS to a list". fixCol is then a list rather than a logical vector, which makes it impossible to subset my original data frame with it:

df[,fixCol]

gives: Error in .subset(x, j) : invalid subscript type 'list'

In the end my goal is to scale all numeric columns of my data frame except this one specified column, using something like this

scaleCol = !fixCol
df_scaled = cbind(df[,fixCol], sapply(df[,scaleCol],scale))

How can I best do this?


Solution

  • We can use an OR condition (|) to build a logical index that is TRUE for every non-numeric column and for the column named 'v2', and then subset the columns of 'df' with it.

    df1 <- df[!sapply(df, is.numeric)|names(df)=='v2']
    head(df1,2)
    #  v2 v4 v5
    #1  1  a  a
    #2  2  b  b
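
  • The same idea can be carried through to the scaling step. The following is a minimal sketch, assuming the example 'df' from the question; the name 'df_scaled' is taken from the question, and wrapping scale() in as.numeric() is my own choice, made so that each column stays a plain numeric vector instead of a one-column matrix. Setting the element with [ ] rather than $ keeps fixCol a named logical vector, because $<- on an atomic vector coerces it to a list.

```r
# Example data from the question
df <- data.frame(v1 = 1:20, v2 = 1:20, v3 = 1:20,
                 v4 = letters[1:20], v5 = letters[1:20])

# Named logical vector: TRUE for columns that should be left untouched
fixCol <- !sapply(df, is.numeric)
fixCol["v2"] <- TRUE   # index by name with [ ]; fixCol stays a logical vector

# Scale the remaining numeric columns in place, so the column order is kept
df_scaled <- df
df_scaled[!fixCol] <- lapply(df[!fixCol], function(x) as.numeric(scale(x)))

# Columns that were scaled
names(df_scaled)[!fixCol]
# [1] "v1" "v3"
```

    Assigning back into a copy of 'df' avoids cbind(), so the result has the same column order as the original; v2, v4 and v5 are unchanged.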