Given the reproducible example below, my objective is to substitute, row by row, the original values with NA in the adjacent columns of a data frame. I know this problem (in its many variants) has been posted before, but I have not yet found a solution using the approach I'm trying to take, i.e. applying a function composition.
In the reproducible example, the column that drives the substitution with NA is column a.
This is what I've done so far.
The very last code snippet is a failing attempt at what I'm actually looking for.
# ifelse approach: it works, but...
# it's error prone: copying and pasting for every column can introduce a lot of trouble
df <- data.frame(a = c(1, 2, NA), b = c(3, NA, 4), c = c(NA, 5, 6))
df$b <- ifelse(is.na(df$a), NA, df$b)
df$c <- ifelse(is.na(df$a), NA, df$c)
# extraction and substitution approach
# same drawback as above
df <- data.frame(a = c(1, 2, NA), b = c(3, NA, 4), c = c(NA, 5, 6))
df$b[is.na(df$a)] <- NA
df$c[is.na(df$a)] <- NA
# definition of a function
# it's a bit better, but still error prone because of the copy and paste
df <- data.frame(a = c(1, 2, NA), b = c(3, NA, 4), c = c(NA, 5, 6))
fix <- function(x, y) ifelse(is.na(x), NA, y)
df$b <- fix(df$a, df$b)
df$c <- fix(df$a, df$c)
# this approach is not working as expected!
# the idea behind it is function composition:
# lapply applies the fix to some columns of the data frame
df<-data.frame(a=c(1, 2, NA), b=c(3, NA, 4), c=c(NA, 5, 6))
df[]<-lapply(df, fix2)
Any help with this particular approach? I'm stuck on how to properly design the substitution function passed to lapply.
If you use lexical closures, you first define a function that generates the function you need. You can then use the generated function however you wish.
# given a driver column, all other columns' values in a row should become NA
# if the driver column's value in that row is NA
# using the lexical scoping of R function definitions, one can achieve that.
df <- data.frame(a = c(1, 2, NA), b = c(3, NA, 4), c = c(NA, 5, 6))
# whatever vector is given, its values should be changed
# according to the driver column's values
na_accustomizer <- function(df, driver_col) {
  ## Returns a function which will accustomize any vector/column
  ## to the driver column's NAs
  function(vec) {
    vec[is.na(df[, driver_col])] <- NA
    vec
  }
}
df[] <- lapply(df, na_accustomizer(df, "a"))
df
##    a  b  c
## 1  1  3 NA
## 2  2 NA  5
## 3 NA NA NA
# na_accustomizer(df, "a") returns
# function(vec) {
#   vec[is.na(df[, "a"])] <- NA
#   vec
# }
# which can then be used the way you want:
# df[] <- lapply(df, na_accustomizer(df, "a"))
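The question mentions applying the fix to only some columns of the data frame. A minimal sketch of that, assuming the generator from the answer is repeated here so the snippet runs on its own; `targets` is a hypothetical helper variable, not part of the original answer:

```r
df <- data.frame(a = c(1, 2, NA), b = c(3, NA, 4), c = c(NA, 5, 6))

# Closure generator from the answer, repeated so this snippet is self-contained
na_accustomizer <- function(df, driver_col) {
  function(vec) {
    vec[is.na(df[, driver_col])] <- NA
    vec
  }
}

# 'targets' (hypothetical name): every column except the driver column
targets <- setdiff(names(df), "a")

# Only the selected columns are passed through the generated function
df[targets] <- lapply(df[targets], na_accustomizer(df, "a"))
df
```

Subsetting on both sides of the assignment leaves the driver column untouched, which matters once the generated function does more than propagate NA.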
df <- data.frame(a = c(1, 2, NA), b = c(3, NA, 4), c = c(NA, 5, 6))
# define it for one column
overtake_NA <- function(df, driver_col, target_col) {
  df[, target_col] <- ifelse(is.na(df[, driver_col]), NA, df[, target_col])
  df
}
# define it for all columns of df
overtake_driver_col_NAs <- function(df, driver_col) {
  for (i in seq_len(ncol(df))) {
    df <- overtake_NA(df, driver_col, i)
  }
  df
}
overtake_driver_col_NAs(df, "a")
#    a  b  c
# 1  1  3 NA
# 2  2 NA  5
# 3 NA NA NA
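For the plain NA case, the column loop can also be collapsed into a single row-wise assignment. This is a different technique from the function-composition approach asked about, shown only as a cross-check of the expected result; `blank_rows_where_na` is a hypothetical name:

```r
df <- data.frame(a = c(1, 2, NA), b = c(3, NA, 4), c = c(NA, 5, 6))

# 'blank_rows_where_na' is a hypothetical helper for this sketch
blank_rows_where_na <- function(df, driver_col) {
  # Logical row index: TRUE where the driver column is NA
  rows <- is.na(df[[driver_col]])
  # Assigning NA to whole rows sets every column in those rows to NA
  df[rows, ] <- NA
  df
}

res <- blank_rows_where_na(df, "a")
res
```

This only covers "blank the row when the driver is NA"; it does not generalize to arbitrary predicates.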
driver_col_to_other_cols <- function(df, driver_col, pred) {
  ## overtake any value of the driver column to the other columns of df,
  ## whenever the predicate function (pred) is fulfilled.
  # define it for one column
  overtake_ <- function(df, driver_col, target_col, pred) {
    selectors <- do.call(pred, list(df[, driver_col]))
    if (deparse(substitute(pred)) != "is.na") {
      # this is to 'recorrect' NAs which intrude into the selector vector
      # when driver_col has NAs. Of course "is.na" is not the only possible
      # way to check for NA, so this edge case is not fully covered
      selectors[is.na(selectors)] <- FALSE
    }
    df[, target_col] <- ifelse(selectors, df[, driver_col], df[, target_col])
    df
  }
  for (i in seq_len(ncol(df))) {
    df <- overtake_(df, driver_col, i, pred)
  }
  df
}
driver_col_to_other_cols(df, "a", function(x) x == 1)
#    a  b  c
# 1  1  1  1
# 2  2 NA  5
# 3 NA  4  6
## if the NA values in the selector vector were not reset to FALSE,
## this would give (because of the NA in the selector vector):
#    a  b  c
# 1  1  1  1
# 2  2 NA  5
# 3 NA NA NA
## hence, when pred does not check for NA in 'a',
## these NA values have to be reverted to the original columns' values.
driver_col_to_other_cols(df, "a", is.na)
#    a  b  c
# 1  1  3 NA
# 2  2 NA  5
# 3 NA NA NA
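A sketch of a variant that does not inspect the predicate's name at all. It rests on the fact that `is.na()` never returns NA, so resetting NA selectors to FALSE is a no-op for it and can be done unconditionally. `driver_col_where` is a hypothetical name, and this is one possible simplification rather than the answer's own code:

```r
df <- data.frame(a = c(1, 2, NA), b = c(3, NA, 4), c = c(NA, 5, 6))

# 'driver_col_where' is a hypothetical name for this sketch
driver_col_where <- function(df, driver_col, pred) {
  selectors <- pred(df[[driver_col]])
  # Treat "unknown" (NA) predicate results as not fulfilled.
  # For is.na this changes nothing, because is.na never returns NA.
  selectors[is.na(selectors)] <- FALSE
  # On the selected rows, copy the driver column's value into every column
  df[] <- lapply(df, function(col) ifelse(selectors, df[[driver_col]], col))
  df
}

res_eq1 <- driver_col_where(df, "a", function(x) x == 1)
res_na  <- driver_col_where(df, "a", is.na)
```

With the example data, `res_eq1` and `res_na` should match the two outputs printed above for `function(x) x == 1` and `is.na` respectively.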