I have the following list df
in R:
structure(list(disease = structure(c(1L, 1L), .Label = "Barcelona", class = "factor"),
`<18` = structure(list(0.193103448275862,
0.0445344129554656), .Names = c(NA_character_, NA_character_
)), `19-25` = structure(list(0.0413793103448276,
0.345748987854251), .Names = c(NA_character_, NA_character_
)), `26-64` = structure(list(0.448275862068966, 0.167611336032389), .Names = c(NA_character_,
NA_character_)), `46-64` = structure(list(0.0344827586206897,
0.00647773279352227), .Names = c(NA_character_, NA_character_
)), `>65` = structure(list(0.282758620689655,
0.435627530364373), .Names = c(NA_character_, NA_character_
)), type = structure(1:2, .Label = c("Clinical Trial", "Real-World"
), class = "factor")), class = "data.frame", row.names = c(NA,
-2L))
I want to rearrange the dataframe, so that I can get each value by city, flat and agegroup using melt
. However, I get an extra column as an output:
melt(df)
city type variable value NA
1 Barcelona flat <18 0.19310345 0.044534413
2 Barcelona house <18 0.19310345 0.044534413
3 Barcelona flat 19 - 25 0.04137931 0.345748988
4 Barcelona house 19 - 25 0.04137931 0.345748988
5 Barcelona flat 26 - 45 0.44827586 0.167611336
6 Barcelona house 26 - 45 0.44827586 0.167611336
7 Barcelona flat 46 - 64 0.03448276 0.006477733
8 Barcelona house 46 - 64 0.03448276 0.006477733
9 Barcelona flat > 65 0.28275862 0.435627530
10 Barcelona house > 65 0.28275862 0.435627530
Is there any way of not having the NA
column and getting unique values in the value
column?
The issue is that your measure columns are list
class, not numeric
class. If we convert them to numeric, melt
will work fine. (I show one way to do it, but it might be better to go earlier in your workstream and prevent the columns from being created as lists in the first place... this is definitely what you should do if my code that works on your sample data runs into issues on the larger data. tidyr::unnest
may be able to help in that case.)
sapply(df, class)
# disease <18 19-25 26-64 46-64 >65 type
# "factor" "list" "list" "list" "list" "list" "factor"
list_cols = sapply(df, is.list)
df[list_cols] = lapply(df[list_cols], unlist)
reshape2::melt(df, id.vars = c("disease", "type"))
# disease type variable value
# 1 Barcelona Clinical Trial <18 0.193103448
# 2 Barcelona Real-World <18 0.044534413
# 3 Barcelona Clinical Trial 19-25 0.041379310
# 4 Barcelona Real-World 19-25 0.345748988
# 5 Barcelona Clinical Trial 26-64 0.448275862
# 6 Barcelona Real-World 26-64 0.167611336
# 7 Barcelona Clinical Trial 46-64 0.034482759
# 8 Barcelona Real-World 46-64 0.006477733
# 9 Barcelona Clinical Trial >65 0.282758621
# 10 Barcelona Real-World >65 0.435627530