I am trying to convert from long format to wide with following code.
data_ige<-read.csv("serology.csv",header = TRUE,na.strings=0)
library(tidyverse)
library(magrittr)
data_new <- data_ige %>% spread(test, value)
I have following dataset existing dataset
After running the code, it transforms the date (but not in the way i want) as seen in the next image, items highlighted in yellow show that these values are coming in multiple rows but they should be in first row instead of a new row. each patient had data either for 1 visit or 2 visit. so all test results from 1 visit, i want to see them in one row and test results from visit 2 in second row.
This screen shot shows the desired outcome.
We need to create a sequence column as there are duplicates
library(dplyr)
library(tidyr)
data_ige %>%
group_by(ID, date, test) %>%
mutate(rn = row_number()) %>%
ungroup %>%
spread(test, value) %>%
#or use pivot_wider as spread is getting deprecated
# pivot_wider(names_from = test, values_from = value) %>%
select(-rn)
# A tibble: 8 x 9
# ID date `1` `3` `4` `5` `6` `7` `8`
# <fct> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
#1 A 2008 0.035 NA NA NA NA NA NA
#2 A 2011 2.75 NA NA NA NA NA NA
#3 B 2011 9.99 3.65 0.68 0.02 0.17 0.5 NA
#4 C 2008 0 NA NA NA NA NA NA
#5 C 2011 NA NA NA NA NA NA 0.09
#6 D 2008 0 0 0 0 0 0.59 0
#7 D 2011 0 0.49 0.2 0.08 0.16 0.5 0.13
#8 D 2011 9.99 NA NA NA NA NA NA
data_ige <- structure(list(ID = structure(c(1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L,
3L, 3L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L,
4L), .Label = c("A", "B", "C", "D"), class = "factor"), date = c(2008,
2011, 2011, 2011, 2011, 2011, 2011, 2011, 2011, 2008, 2011, 2008,
2008, 2008, 2008, 2008, 2008, 2008, 2011, 2011, 2011, 2011, 2011,
2011, 2011), test = c(1, 1, 1, 3, 4, 5, 6, 7, 8, 1, 1, 1, 3,
4, 5, 6, 7, 8, 1, 3, 4, 5, 6, 7, 8), value = c(0.035, 2.75, 9.99,
3.65, 0.68, 0.02, 0.17, 0.5, 0.09, 0, 0, 0, 0, 0, 0, 0, 0.59,
0, 9.99, 0.49, 0.2, 0.08, 0.16, 0.5, 0.13)),
class = "data.frame", row.names = c(NA,
-25L))