The problem I'm having is that inner_join()
creates a new row for every matching pair of rows, with all the associated values.
An example:
library(dplyr)

# Villages: two of them share ZIP code 1000
villages <- data.frame(
  zip_code = c("1000", "1000", "1001"),
  village = c("village_x", "village_y", "village_z")
)
# Cases: one row per case
cases <- data.frame(
  zip_code = c("1000", "1000", "1001"),
  case = c("case1", "case2", "case3")
)
data <- inner_join(villages, cases, by = "zip_code")
This join increases the number of rows: the result has 5 rows for only 3 cases, because several villages share the same ZIP code.
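To see where the extra rows come from, it can help to count how often each ZIP code occurs on each side before joining. The following is only a sketch on the example data; the names village_counts, case_counts, expected, n_villages, n_cases and n_joined are illustrative and not part of the original code.

```r
library(dplyr)

villages <- data.frame(
  zip_code = c("1000", "1000", "1001"),
  village  = c("village_x", "village_y", "village_z")
)
cases <- data.frame(
  zip_code = c("1000", "1000", "1001"),
  case     = c("case1", "case2", "case3")
)

# Count how often each ZIP code occurs on each side of the join
village_counts <- count(villages, zip_code, name = "n_villages")
case_counts    <- count(cases, zip_code, name = "n_cases")

# Rows produced per ZIP code by the join = n_villages * n_cases
expected <- inner_join(village_counts, case_counts, by = "zip_code") %>%
  mutate(n_joined = n_villages * n_cases)

expected
sum(expected$n_joined)  # 5 rows in total, although there are only 3 cases
```

ZIP code 1000 has two villages and two cases, so it alone contributes four rows.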
How can I make it so that villages with the same ZIP code end up in the same cell?
Or so that the merge pairs each case only with the first matching village?
@ConnerSexton's solution worked:
data <- inner_join(villages, cases, by = "zip_code") %>%
  group_by(zip_code, case) %>%
  summarize(village = paste(village, collapse = ", "), .groups = "drop")
Thanks a lot!
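For the other variant asked about (pairing each case only with the first village found), a minimal sketch might look like the following. It assumes that "first" means the first row per ZIP code in the order of the villages data frame; the names first_village and cases_first are only illustrative.

```r
library(dplyr)

villages <- data.frame(
  zip_code = c("1000", "1000", "1001"),
  village  = c("village_x", "village_y", "village_z")
)
cases <- data.frame(
  zip_code = c("1000", "1000", "1001"),
  case     = c("case1", "case2", "case3")
)

# Keep only the first village listed for each ZIP code
# (assumption: "first" means first in row order of `villages`)
first_village <- distinct(villages, zip_code, .keep_all = TRUE)

# Each case now matches at most one village, so no case is duplicated
cases_first <- inner_join(cases, first_village, by = "zip_code")
cases_first
```

Because the village table is reduced to one row per ZIP code before the join, the result keeps exactly one row per case.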