I have a data table that looks like this,
What I want to do is create a new column in the table containing the fold change in my 'readout' for each sample relative to that sample's WEEK 0 value, which will be, for example,
Sample 1 at WEEK 0/Sample 1 at WEEK 0
Sample 1 at WEEK 4/Sample 1 at WEEK 0
Sample 1 at WEEK 14/Sample 1 at WEEK 0
and so on and so forth for all the time points for Sample 1 and then calculate the same thing for the rest of my samples using their respective 'readout' from WEEK 0.
So far, what I have tried is something along the lines of,
SampleIDs<-as.character(unique(table$ID))
table$FC<-for(i in table[i,]){
for(j in SampleIDs){
if(table[i,"ID"]==j){
table[i,3]/table[(("WEEK"==0)&("ID"==j)),3]
}
}
}
}
When run, the code returns the error,
Error in if (table[i, "SampleID"] == j) { : argument is of length zero
What I was trying to do was create a separate vector of unique IDs and use it in a for loop that goes row by row, checking that the row belongs to the sample with ID j, and then retrieve the cell that holds the readout for sample j at WEEK 0 to calculate the fold change. Any help on how to do this would be greatly appreciated! Thank you
Maybe we could group by 'ID' and create a new column by dividing 'readout' by the 'readout' value where 'WEEK' is 0
library(dplyr)
df1 %>%
group_by(ID) %>%
mutate(new = readout/readout[WEEK == 0])
If 'WEEK' is already ordered within each 'ID', so that the WEEK 0 row comes first
df1 %>%
group_by(ID) %>%
mutate(new = readout/readout[1])
Or with data.table
library(data.table)
setDT(df1)[, new := readout/readout[WEEK == 0], by = ID]
If it is already ordered
setDT(df1)[, new := readout/readout[1], by = ID]
Or using base R
df1$new <- with(df1, readout/setNames(readout[WEEK == 0], ID[WEEK == 0])[as.character(ID)])
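Another base R option is a lookup with match(). This is only a minimal sketch: it assumes every 'ID' has exactly one row with 'WEEK' equal to 0, and the helper name 'baseline' is just illustrative. The example data are repeated so the block runs on its own.

```r
# Example data (same values as df1 in this answer)
df1 <- data.frame(
  ID = c(1L, 1L, 1L, 1L, 2L, 2L, 3L, 3L, 3L, 3L),
  WEEK = c(0, 4, 14, 24, 0, 4, 0, 4, 14, 24),
  readout = c(5, 6, 7, 8, 1, 1.5, 1, 1, 5, 3)
)

# One row per ID holding its WEEK 0 readout
# (assumption: exactly one WEEK 0 row per ID)
baseline <- df1[df1$WEEK == 0, c("ID", "readout")]

# match() gives, for each row of df1, the position of its ID in 'baseline',
# so each readout is divided by the WEEK 0 readout of the same ID
df1$new <- df1$readout / baseline$readout[match(df1$ID, baseline$ID)]
df1
```

Because the lookup goes through the ID values rather than through positions, it does not depend on the rows being ordered or on the IDs being consecutive integers.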
Regarding the console showing '+': it is just a continuation prompt indicating that the expression is not yet complete. Other consoles behave in a similar way, e.g. the Julia REPL does not show any symbol, but it only gives the output once the full expression has been entered.
df1 <- structure(list(ID = c(1L, 1L, 1L, 1L, 2L, 2L, 3L, 3L, 3L, 3L),
WEEK = c(0, 4, 14, 24, 0, 4, 0, 4, 14, 24), readout = c(5,
6, 7, 8, 1, 1.5, 1, 1, 5, 3)), class = "data.frame", row.names = c(NA,
-10L))
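For completeness, the row-by-row idea from the question can also be made to work. In the original attempt, 'i' is used inside table[i,] before it is defined, "WEEK"==0 compares the literal string "WEEK" with 0 (which is always FALSE) rather than the column, and a for loop returns NULL, so assigning it to table$FC stores nothing. A minimal sketch of a working loop, assuming exactly one 'WEEK' 0 row per 'ID' and using the column name 'FC' from the question:

```r
# Example data (same values as df1 above)
df1 <- data.frame(
  ID = c(1L, 1L, 1L, 1L, 2L, 2L, 3L, 3L, 3L, 3L),
  WEEK = c(0, 4, 14, 24, 0, 4, 0, 4, 14, 24),
  readout = c(5, 6, 7, 8, 1, 1.5, 1, 1, 5, 3)
)

# Pre-allocate the result column, then fill it one row at a time
df1$FC <- NA_real_
for (i in seq_len(nrow(df1))) {
  # Logical index of the WEEK 0 row that has the same ID as row i
  # (assumption: exactly one such row exists)
  is_baseline <- df1$ID == df1$ID[i] & df1$WEEK == 0
  df1$FC[i] <- df1$readout[i] / df1$readout[is_baseline]
}
df1
```

The grouped versions above do the same thing without an explicit loop and are usually the more idiomatic choice.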