I've found that across() is very useful for repeating operations on several columns.
However, I still haven't fully understood how to select specific columns for the operation.
Let's say that I want to apply a function to all columns in mtcars except gear and carb.
I tried something like:
library(dplyr)

# Function to use over columns: subtract the column mean from each value
demean <- function(x) {
  x - mean(x, na.rm = TRUE)
}
# Attempt to use the function on all columns but gear and carb
mtcars %>% mutate(across(.cols = select(., -gear, -carb), demean))
However, this throws the error
Error: Problem with `mutate()` input `..1`.
x Must subset columns with a valid subscript vector.
x Subscript has the wrong type `data.frame<
What is the proper way to unselect certain columns in across()?
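It's easier than you think. The .cols argument of across() takes a tidyselect expression, not a data frame, which is why passing the result of select() fails. Negate the column names directly instead: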
mtcars %>% mutate(across(-c(gear, carb), demean))
mpg cyl disp hp drat wt
Mazda RX4 0.909375 -0.1875 -70.721875 -36.6875 0.3034375 -0.59725
Mazda RX4 Wag 0.909375 -0.1875 -70.721875 -36.6875 0.3034375 -0.34225
Datsun 710 2.709375 -2.1875 -122.721875 -53.6875 0.2534375 -0.89725
Hornet 4 Drive 1.309375 -0.1875 27.278125 -36.6875 -0.5165625 -0.00225
Hornet Sportabout -1.390625 1.8125 129.278125 28.3125 -0.4465625 0.22275
Valiant -1.990625 -0.1875 -5.721875 -41.6875 -0.8365625 0.24275
Duster 360 -5.790625 1.8125 129.278125 98.3125 -0.3865625 0.35275
Merc 240D 4.309375 -2.1875 -84.021875 -84.6875 0.0934375 -0.02725
Merc 230 2.709375 -2.1875 -89.921875 -51.6875 0.3234375 -0.06725
qsec vs am gear carb
Mazda RX4 -1.38875 -0.4375 0.59375 4 4
Mazda RX4 Wag -0.82875 -0.4375 0.59375 4 4
Datsun 710 0.76125 0.5625 0.59375 4 1
Hornet 4 Drive 1.59125 0.5625 -0.40625 3 1
Hornet Sportabout -0.82875 -0.4375 -0.40625 3 2
Valiant 2.37125 0.5625 -0.40625 3 1
Duster 360 -2.00875 -0.4375 -0.40625 3 4
Merc 240D 2.15125 0.5625 -0.40625 4 2
Merc 230 5.05125 0.5625 -0.40625 4 2
[ reached 'max' / getOption("max.print") -- omitted 23 rows ]
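
For completeness, here is a short sketch of two other ways to write the same exclusion, assuming dplyr 1.0.0 or later. It reuses demean() from the question; drop_cols, res_minus, res_bang and res_chr are hypothetical names introduced only for this illustration. The `!` form is the complement operator from tidyselect, and all_of() is handy when the column names to skip are stored as strings.

```r
library(dplyr)

# Same helper as in the question
demean <- function(x) {
  x - mean(x, na.rm = TRUE)
}

# Form used in the answer above: minus sign in front of c()
res_minus <- mtcars %>% mutate(across(-c(gear, carb), demean))

# Equivalent selection using the tidyselect complement operator `!`
res_bang <- mtcars %>% mutate(across(!c(gear, carb), demean))

# Column names held in a character vector (drop_cols is a hypothetical name)
drop_cols <- c("gear", "carb")
res_chr <- mtcars %>% mutate(across(!all_of(drop_cols), demean))

# Check that the three forms agree
all.equal(res_minus, res_bang)
all.equal(res_minus, res_chr)
```

In all three cases gear and carb are left untouched, and every other column is centered on its own mean.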