I have a dataset that shows the logins by date and unique_session_code for each user that visits the webpage of a bakery. I am trying to come a code that allows me to determine if this is the first, second, third and so on... visit of this customer in other words the "nth_visit" and I am really looking for a solution using the tidyverse package, but if there is no way to do that using tidyverse then all other solutions are just as appreciated. thanks for the help guys!
USER | LOG_DATE | TX_REV |
---|---|---|
ALEX | 30/04/2021 | uyjd |
MARY | 30/04/2021 | dn89 |
DOLLY | 30/04/2021 | nqmd0 |
FRANK | 29/04/2021 | nsj456 |
MARY | 29/04/2021 | zlpa2 |
DOLLY | 29/04/2021 | s09dm |
ALEX | 28/04/2021 | jds02 |
MARY | 28/04/2021 | kqos98 |
DOLLY | 28/04/2021 | uiskk0 |
This would be my desired result:
USER | LOG_DATE | TX_REV | nth_visit |
---|---|---|---|
ALEX | 28/04/2021 | jds02 | 1 |
ALEX | 30/04/2021 | uyjd | 2 |
DOLLY | 28/04/2021 | uiskk0 | 1 |
DOLLY | 29/04/2021 | s09dm | 2 |
DOLLY | 30/04/2021 | nqmd0 | 3 |
FRANK | 29/04/2021 | nsj456 | 1 |
MARY | 28/04/2021 | kqos98 | 1 |
MARY | 29/04/2021 | zlpa2 | 2 |
MARY | 30/04/2021 | dn89 | 3 |
A data frame that has the "nth_visit" to the website by each attempt of the visitor
What have I done?
I have tried to arrange the date in chronological order and group by ID, and reviewing some documentation I tried to use the function rank_dense but it does not give me the results that I am looking for..not even close.
Thank you so much guys
code:
df<-tibble::tribble(
~USER, ~LOG_DATE, ~TX_REV,
"ALEX", "30/04/2021", "uyjd",
"MARY", "30/04/2021", "dn89",
"DOLLY", "30/04/2021", "nqmd0",
"FRANK", "29/04/2021", "nsj456",
"MARY", "29/04/2021", "zlpa2",
"DOLLY", "29/04/2021", "s09dm",
"ALEX", "28/04/2021", "jds02",
"MARY", "28/04/2021", "kqos98",
"DOLLY", "28/04/2021", "uiskk0")
Sorting by date will help, otherwise the visit numbering will be in order of appearance in the data rather than chronological.
df %>%
mutate(LOG_DATE_DT = as.Date(LOG_DATE, "%d/%m/%Y")) %>%
arrange(USER, LOG_DATE_DT) %>%
group_by(USER) %>%
mutate(nth_visit = row_number()) %>%
ungroup()
# A tibble: 9 x 5
USER LOG_DATE TX_REV LOG_DATE_DT nth_visit
<chr> <chr> <chr> <date> <int>
1 ALEX 28/04/2021 jds02 2021-04-28 1
2 ALEX 30/04/2021 uyjd 2021-04-30 2
3 DOLLY 28/04/2021 uiskk0 2021-04-28 1
4 DOLLY 29/04/2021 s09dm 2021-04-29 2
5 DOLLY 30/04/2021 nqmd0 2021-04-30 3
6 FRANK 29/04/2021 nsj456 2021-04-29 1
7 MARY 28/04/2021 kqos98 2021-04-28 1
8 MARY 29/04/2021 zlpa2 2021-04-29 2
9 MARY 30/04/2021 dn89 2021-04-30 3