I would like to spread the date column into new columns (Date 1, Date 2, Date 3, etc.) for each observation in chronological order. I would also like to add a column with the corresponding result My data looks like this:
> CV
Name Accession MRN Collected Result
1 Doe, John 123 55555 2022-01-05 Detected
2 Doe, John 234 55555 2022-01-06 Negative
3 Doe, John 345 55555 2022-01-07 Detected
4 Doe, Jane 456 66666 2022-01-08 Negative
5 Doe, Jane 567 66666 2022-01-09 Negative
6 Doe, Jane 678 66666 2022-01-20 Negative
I would like it to look like this
Name MRN Date_1 Result_1 Date_2 Result_2 Date_3 Result_3
Doe, John 55555 2022-01-05 Detected 2022-01-06 Negative 2022-01-07 Detected
Doe, Jane 66666 2022-01-08 Negative 2022-01-09 Negative 2022-01-20 Negative
I think I may need to use the accession number as a key but I'm not really sure how to perform this?
Name <- c("Doe, John", "Doe, John", "Doe, John", "Doe, Jane",
"Doe, Jane", "Doe, Jane")
Accession <- c(123, 234, 345, 456, 567, 678)
MRN <- c(55555, 55555, 55555, 66666, 66666, 66666)
Collected <- c("2022-01-05", "2022-01-06", "2022-01-07", "2022-01-08",
"2022-01-09", "2022-01-20")
Result <- c("Detected", "Negative", "Detected", "Negative",
"Negative", "Negative")
CV <- data.frame(Name, Accession, MRN, Collected, Result)
You could replace the Accession by a running id (1, 2, 3, ...) and use reshape
.
> CV |>
+ transform(Accession=ave(Name, Name, FUN=seq_along)) |>
+ reshape(idvar=c('Name', 'MRN'), timevar='Accession', direction='wide')
Name MRN Collected.1 Result.1 Collected.2 Result.2 Collected.3 Result.3
1 Doe, John 55555 2022-01-05 Detected 2022-01-06 Negative 2022-01-07 Detected
4 Doe, Jane 66666 2022-01-08 Negative 2022-01-09 Negative 2022-01-20 Negative
If you want the column to be named Date instead of Collected add a line that does the cosmetics.
> CV |>
+ transform(Accession=ave(Name, Name, FUN=seq_along)) |>
+ reshape(idvar=c('Name', 'MRN'), timevar='Accession', direction='wide') |>
+ {\(.) setNames(., sub('Collected', 'Date', names(.)))}()
Name MRN Date.1 Result.1 Date.2 Result.2 Date.3 Result.3
1 Doe, John 55555 2022-01-05 Detected 2022-01-06 Negative 2022-01-07 Detected
4 Doe, Jane 66666 2022-01-08 Negative 2022-01-09 Negative 2022-01-20 Negative
Data:
> dput(CV)
structure(list(Name = c("Doe, John", "Doe, John", "Doe, John",
"Doe, Jane", "Doe, Jane", "Doe, Jane"), Accession = c(123, 234,
345, 456, 567, 678), MRN = c(55555, 55555, 55555, 66666, 66666,
66666), Collected = c("2022-01-05", "2022-01-06", "2022-01-07",
"2022-01-08", "2022-01-09", "2022-01-20"), Result = c("Detected",
"Negative", "Detected", "Negative", "Negative", "Negative")), class = "data.frame", row.names = c(NA,
-6L))