Tags: r, printing, variable-assignment, apply, calculated-columns

Print column name in an R apply and save as a new column on a data frame


I've written an apply() call that 'loops' over a subset of the columns in a data frame and should print some output. As an example, I'm just dividing each column by another column (I know there are other ways to do this), so we have:

apply(df[c("a","b","c")], 2, function(x) {
    # x is the current column, passed in as a plain vector
    z <- x / df[["divisor"]]
    }
)

I'd like to print the column name currently being operated on, but colnames(x) (for example) doesn't work.

Then I want to save a new column, based on each colname (a.exp,b.exp or whatever) into the same df.


Solution

  • I think a lot of people will run into this same issue, so I'm answering my own question (having eventually found the answers). As noted below, there are other answers to both parts (thanks!), but none combining these issues (and some of the examples are more complex).

    First, it seems there really is no way to get the column name from inside the function (seems weird to me!): each column is passed in as a bare vector, without its name. So instead you 'loop' over the column names, and within the function pull out the actual vectors by name, e.g. df[c(x)].
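    A minimal sketch of why colnames(x) comes back empty, assuming a small made-up data frame df (the values are purely illustrative): apply() hands each column to the function as a plain vector, so there is no column name left to query, whereas iterating over the names keeps the name in hand.

    ```r
    # Hypothetical example data; values are illustrative only.
    df <- data.frame(a = c(2, 4), b = c(6, 8), c = c(10, 12), divisor = c(2, 2))

    # Inside apply() each column arrives as a bare vector, so colnames(x) is NULL.
    inside <- apply(df[c("a", "b", "c")], 2, function(x) is.null(colnames(x)))
    print(inside)

    # Iterating over the names keeps the name available for printing and indexing.
    for (nm in c("a", "b", "c")) {
      cat("processing column:", nm, "\n")
    }
    ```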

    Then the key thing is that to assign, and so create your new columns, from within an apply, you use '<<-'.

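    # sapply() rather than apply(): apply() needs a matrix or array, not a character vector
    invisible(sapply(colnames(df[c("a","b","c")]), function(x) {
    print(x)                          # name of the column being processed
    z <- df[[x]] / df[["divisor"]]    # divide the named column by the divisor column
    df[[paste0(x, "ind")]] <<- z      # <<- modifies df outside the function
    }
    ))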
    

    The <<- operator is discussed at e.g. https://stackoverflow.com/questions/2628621/how-do-you-use-scoping-assignment-in-r
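    If assigning into the enclosing environment feels uncomfortable, one alternative is to build the new columns with lapply() and add them in a single ordinary <- step. This is not the approach described above, only a sketch; the data frame df, its values and the ".exp" suffix are made up for illustration.

    ```r
    # Hypothetical example data; values are illustrative only.
    df <- data.frame(a = c(2, 4), b = c(6, 8), c = c(10, 12), divisor = c(2, 2))
    cols <- c("a", "b", "c")

    # lapply() over the selected columns returns a list of new vectors;
    # assigning that list to new column names adds them all at once.
    df[paste0(cols, ".exp")] <- lapply(df[cols], function(x) x / df$divisor)

    print(colnames(df))
    ```

    Because the assignment happens outside the anonymous function, no <<- is needed, though this version does not print each name as it goes.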

    I got confused for two reasons. Initially I had only vaguely thought about saving the outputs, and I figured I needed the column itself inside the function (I incorrectly assumed apply worked like a loop, so I could use a counter as an index or something). I also assumed there should be some way to get the name separately (e.g. colnames(x)).
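    For what it's worth, the counter-as-index idea does work with an explicit for loop, where both the position and the name are available and a plain <- is enough. A rough sketch, again assuming a made-up df and a ".exp" suffix:

    ```r
    # Hypothetical example data; values are illustrative only.
    df <- data.frame(a = c(2, 4), b = c(6, 8), c = c(10, 12), divisor = c(2, 2))
    cols <- c("a", "b", "c")

    for (i in seq_along(cols)) {
      nm <- cols[i]                                   # name of the current column
      cat("column", i, "is", nm, "\n")                # print position and name
      df[[paste0(nm, ".exp")]] <- df[[nm]] / df[["divisor"]]
    }
    ```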

    There are a couple of related stack questions: