
Using a function and mapply in R to create new columns that sum other columns


Suppose I have a dataframe, df, and I want to create a new column called "c" by adding two existing columns, "a" and "b". I would simply run the following code:

df$c <- df$a + df$b

But I also want to do this for many other columns. So why won't my code below work?

library(dplyr)  # needed for %>% and select() used further down

# Reproducible data:

martial_arts <- data.frame(gym_branch=c("downtown_a", "downtown_b", "uptown", "island"),
                           day_boxing=c(5,30,25,10),day_muaythai=c(34,18,20,30),
                           day_bjj=c(0,0,0,0),day_judo=c(10,0,5,0),
                           evening_boxing=c(50,45,32,40), evening_muaythai=c(50,50,45,50),
                           evening_bjj=c(60,60,55,40), evening_judo=c(25,15,30,0))

# Creating a list of the new column names of the columns that need to be added to the martial_arts dataframe:

pattern<-c("_boxing","_muaythai","_bjj","_judo")
d<- expand.grid(paste0("martial_arts$total",pattern))

# Creating lists of the columns that will be added to each other:

e<- names(martial_arts %>% select(day_boxing:day_judo))
f<- names(martial_arts %>% select(evening_boxing:evening_judo))

# Writing a function and using mapply:

kick_him <- function(d,e,f){d <- rowSums(martial_arts[ , c(e, f)], na.rm=T)}

mapply(kick_him,d,e,f)

Now, mapply produces the correct results in terms of the addition:

> mapply(kick_him,d,e,f)
     Var1 <NA> <NA> <NA>
[1,]   55   84   60   35
[2,]   75   68   60   15
[3,]   57   65   55   35
[4,]   50   80   40    0

But it doesn't add the new columns to the martial_arts dataframe. In theory, the function should do the following:

martial_arts$total_boxing <- martial_arts$day_boxing + martial_arts$evening_boxing
...
...
martial_arts$total_judo <- martial_arts$day_judo + martial_arts$evening_judo

and add four new total columns to martial_arts.

So what am I doing wrong?


Solution

  • The assignment is the problem. Inside kick_him, `d <- ...` only rebinds the function's local variable d, and "martial_arts$total_boxing" is just a string, not a reference to a column. Use the bare column name "total_boxing" instead, and put the assignment on the left-hand side of the Map/mapply call. Because the OP already built the 'martial_arts$' prefix into the Var1 column of 'd', we strip that prefix with sub() and then do the assignment:

    kick_him <- function(e,f){rowSums(martial_arts[ , c(e, f)], na.rm=TRUE)}
    martial_arts[sub(".*\\$", "", d$Var1)] <- Map(kick_him, e, f)
    
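The scoping point can be seen in isolation with a small sketch. The data frame `toy` and the function `add_total` are hypothetical names made up for this illustration; they only mirror the shape of the original kick_him call.

```r
# Toy data frame (hypothetical, for illustration only)
toy <- data.frame(a = 1:3, b = 4:6)

# Mirrors kick_him: the first argument is a string such as "toy$c"
add_total <- function(target, x, y) {
  # This rebinds the local variable `target`; `toy` itself is never modified
  target <- toy[[x]] + toy[[y]]
}

add_total("toy$c", "a", "b")
has_c_before <- "c" %in% names(toy)   # no column was created by the call

# Assigning by column name at the top level does modify the data frame
toy[["c"]] <- toy[["a"]] + toy[["b"]]
has_c_after <- "c" %in% names(toy)
```

The function still returns the sums, which is why mapply printed the right numbers, but nothing outside the function changes until the result is assigned to a column by name.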

    Check the dataset now:

    > martial_arts
      gym_branch day_boxing day_muaythai day_bjj day_judo evening_boxing evening_muaythai evening_bjj evening_judo total_boxing total_muaythai total_bjj total_judo
    1 downtown_a          5           34       0       10             50               50          60           25           55             84        60         35
    2 downtown_b         30           18       0        0             45               50          60           15           75             68        60         15
    3     uptown         25           20       0        5             32               45          55           30           57             65        55         35
    4     island         10           30       0        0             40               50          40            0           50             80        40          0
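As a variation on the answer above, the helper object d (and the sub() call that cleans it up) can be skipped by building all three sets of column names from the activity suffixes. This is only a sketch under that assumption; the names `sports`, `day_cols`, `evening_cols` and `total_cols` are introduced here for illustration and do not appear in the original post.

```r
# Same data as in the question
martial_arts <- data.frame(
  gym_branch = c("downtown_a", "downtown_b", "uptown", "island"),
  day_boxing = c(5, 30, 25, 10), day_muaythai = c(34, 18, 20, 30),
  day_bjj = c(0, 0, 0, 0), day_judo = c(10, 0, 5, 0),
  evening_boxing = c(50, 45, 32, 40), evening_muaythai = c(50, 50, 45, 50),
  evening_bjj = c(60, 60, 55, 40), evening_judo = c(25, 15, 30, 0)
)

# Build the column names directly from the activity suffixes (illustrative names)
sports       <- c("boxing", "muaythai", "bjj", "judo")
day_cols     <- paste0("day_", sports)
evening_cols <- paste0("evening_", sports)
total_cols   <- paste0("total_", sports)

# Map pairs each day column with its evening column; the assignment on the
# left-hand side creates the four total_* columns in one step
martial_arts[total_cols] <- Map(
  function(day, evening) rowSums(martial_arts[, c(day, evening)], na.rm = TRUE),
  day_cols, evening_cols
)
```

This needs only base R, so dplyr is not required for this version. The new columns take their names from total_cols on the left-hand side, not from the names of the list that Map returns.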