
How do I retrieve a value from a row containing two keywords of interest to make a new column in R?


I have a data table that looks like this,

What I want to do is create a new column in the table containing the fold change in my 'readout' for each sample relative to WEEK 0, which will be, for example,

Sample 1 at WEEK 0/Sample 1 at WEEK 0

Sample 1 at WEEK 4/Sample 1 at WEEK 0

Sample 1 at WEEK 14/Sample 1 at WEEK 0

and so on for all the time points for Sample 1, and then the same calculation for the rest of my samples, using their respective 'readout' from WEEK 0.

So far, what I have tried is something along the lines of,

SampleIDs<-as.character(unique(table$ID))

table$FC<-for(i in table[i,]){
for(j in SampleIDs){

if(table[i,"ID"]==j){

    table[i,3]/table[(("WEEK"==0)&("ID"==j)),3]
    }
    }
  }

}

When run, the code returns the error,

Error in if (table[i, "SampleID"] == j) { : argument is of length zero

What I was trying to do was create a separate vector of unique IDs and use it in a for loop that goes row by row, checking that each row contains the sample with the same ID. For each such row, I then wanted to retrieve the cell that holds the data for the sample with ID j AND is from WEEK 0, and use it to calculate my fold change value. Any help on how to do this would be greatly appreciated! Thank you


Solution

  • Maybe we could group by 'ID' and create a new column by dividing the 'readout' by the 'readout' where 'WEEK' is 0

    library(dplyr)
    df1 %>% 
        group_by(ID) %>% 
        mutate(new = readout/readout[WEEK == 0])
    

    If 'WEEK' is already ordered within each 'ID'

    df1 %>%
        group_by(ID) %>%
        mutate(new = readout/readout[1])
    

    Or with data.table

    library(data.table)
    setDT(df1)[, new := readout/readout[WEEK == 0], by = ID]
    

    If it is already ordered

    setDT(df1)[, new := readout/readout[1], by = ID]
    

    Or using base R

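    # name each WEEK 0 readout by its own ID, then look it up by name (not by position)
    df1$new <- with(df1, readout/setNames(readout[WEEK == 0], ID[WEEK == 0])[as.character(ID)])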
    

    Regarding the console showing +, it is just the continuation prompt, a symbol showing that the expression is not yet complete


    Other consoles behave similarly. For example, the Julia REPL does not show any symbol, but it gives the output only after the full expression is complete


    data

    df1 <- structure(list(ID = c(1L, 1L, 1L, 1L, 2L, 2L, 3L, 3L, 3L, 3L), 
        WEEK = c(0, 4, 14, 24, 0, 4, 0, 4, 14, 24), readout = c(5, 
        6, 7, 8, 1, 1.5, 1, 1, 5, 3)), class = "data.frame", row.names = c(NA, 
    -10L))
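
For anyone who prefers to stay close to the loop idea from the question, here is a minimal base R sketch that loops over the unique IDs rather than over rows. It is not one of the methods above, just an illustration. The column name `FC` comes from the question, the data are the sample `df1` from this answer, and the rule of leaving `NA` when a sample does not have exactly one WEEK 0 row is an assumption of this sketch.

```r
# Sample data copied from the "data" section of the answer
df1 <- data.frame(
  ID      = c(1L, 1L, 1L, 1L, 2L, 2L, 3L, 3L, 3L, 3L),
  WEEK    = c(0, 4, 14, 24, 0, 4, 0, 4, 14, 24),
  readout = c(5, 6, 7, 8, 1, 1.5, 1, 1, 5, 3)
)

# Pre-allocate the new column; NA marks samples without a usable baseline
df1$FC <- NA_real_

# Loop over each unique sample ID (the SampleIDs idea from the question)
for (id in unique(df1$ID)) {
  rows     <- df1$ID == id                       # all rows of this sample
  baseline <- df1$readout[rows & df1$WEEK == 0]  # its WEEK 0 readout

  # Assumption of this sketch: divide only when exactly one WEEK 0 row exists
  if (length(baseline) == 1) {
    df1$FC[rows] <- df1$readout[rows] / baseline
  }
}

print(df1)
```

On the sample data this gives 1, 1.2, 1.4, 1.6 for ID 1, which should match the `new` column from the grouped approaches above. The grouped versions are shorter and usually preferable, but the loop makes each step explicit.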