I am trying to find the first occurrence of a FALSE in a dataframe for each row value. My rows are specific occurrences and the columns are dates. I would like to be able to find the date of first FALSE so that I can use that value to find a return date.
An example structure of my dataframe:
df <- data.frame(ID = c(1,2,3), '2001' = c(TRUE, TRUE, TRUE),
'2002' = c(FALSE, TRUE, FALSE), '2003' = c(TRUE, FALSE, TRUE))
I want to end up with a second dataframe or list that contains the ID and the column name that identifies the first instance of a FALSE.
For example :
ID | Date
1 | 2002
2 | 2003
3 | 2002
I do not know the mechanism to find such a result.
The actual dataframe contains a couple thousand rows so I unfortunately can't do it by hand.
I am a new R user so please don't refrain from suggesting things you might expect a more experienced R user to have already thought about.
Thanks in advance
Try this using tidyverse
functions. You can reshape data to long and then filter for F
values. If there are some duplicated rows the second filter can avoid them. Here the code:
library(dplyr)
library(tidyr)
#Code
newdf <- df %>% pivot_longer(-ID) %>%
group_by(ID) %>%
filter(value==F) %>%
filter(!duplicated(value)) %>% select(-value) %>%
rename(Myname=name)
Output:
# A tibble: 3 x 2
# Groups: ID [3]
ID Myname
<dbl> <chr>
1 1 2002
2 2 2003
3 3 2002
Another option without duplicated values can be using the row_number()
to extract the first value (row_number()==1
):
library(dplyr)
library(tidyr)
#Code 2
newdf <- df %>% pivot_longer(-ID) %>%
group_by(ID) %>%
filter(value==F) %>%
mutate(V=ifelse(row_number()==1,1,0)) %>%
filter(V==1) %>%
select(-c(value,V)) %>% rename(Myname=name)
Output:
# A tibble: 3 x 2
# Groups: ID [3]
ID Myname
<dbl> <chr>
1 1 2002
2 2 2003
3 3 2002
Or using base R
with apply()
and a generic function:
#Code 3
out <- data.frame(df[,1,drop=F],Res=apply(df[,-1],1,function(x) names(x)[min(which(x==F))]))
Output:
ID Res
1 1 2002
2 2 2003
3 3 2002