I have a dataframe in the following long format:
I need to convert it into a list which should look something like this:
Each main element of the list would be an "Instance No.", and its sub-elements should contain all of its corresponding Parameter and Value pairs, in the format "Parameter X" = "abc", listed one after the other, as you can see in the second picture.
Is there any existing function which can do this? I wasn't really able to find any. Any help would be really appreciated.
Thank you.
A dplyr solution
library(dplyr)
df_original <- data.frame("Instance No." = c(3,3,3,3,5,5,5,2,2,2,2),
                          "Parameter" = c("age", "workclass", "education", "occupation",
                                          "age", "workclass", "education",
                                          "age", "workclass", "education", "income"),
                          "Value" = c("Senior", "Private", "HS-grad", "Sales",
                                      "Middle-aged", "Gov", "Hs-grad",
                                      "Middle-aged", "Private", "Masters", "Large"),
                          check.names = FALSE)
# Param_Value will be the properly formatted "Parameter=Value" vector.
df_modified <- mutate(df_original,
                      Param_Value = paste0(Parameter, "=", Value))
# Drop the Parameter and Value columns now that the data is contained in Param_Value.
df_modified <- select(df_modified,
                      `Instance No.`,
                      Param_Value)
# Split into a list of data frames, one per Instance No.
# split() coerces the grouping variable to a factor.
list_format <- split(df_modified,
                     df_modified$`Instance No.`)
# The Instance No. is still in each data frame. Loop through each and strip the column.
list_simplified <- lapply(list_format,
                          select, -`Instance No.`)
# Unlist the remaining Param_Value column and drop the names.
list_out <- lapply(list_simplified,
                   unlist, use.names = FALSE)
There should now be a list of vectors formatted as requested.
$`2`
[1] "age=Middle-aged" "workclass=Private" "education=Masters" "income=Large"
$`3`
[1] "age=Senior" "workclass=Private" "education=HS-grad" "occupation=Sales"
$`5`
[1] "age=Middle-aged" "workclass=Gov" "education=Hs-grad"
The posted data.table solution is faster, but I think this is a bit more understandable.
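For comparison, here is a minimal base R sketch that should give the same list without any packages. It assumes the same column names as above (`Instance No.`, `Parameter`, `Value`); the names `df_example` and `list_base` are just placeholders I made up for illustration. The idea is to build the "Parameter=Value" strings first and then split that character vector directly by instance number, so there are no intermediate data frames to clean up.

```r
# Small example data with the same layout as df_original above (illustrative only).
df_example <- data.frame("Instance No." = c(3, 3, 5, 5, 2),
                         "Parameter" = c("age", "workclass", "age", "workclass", "age"),
                         "Value" = c("Senior", "Private", "Middle-aged", "Gov", "Middle-aged"),
                         check.names = FALSE)

# Build the "Parameter=Value" strings, then split the vector by Instance No.
# split() coerces the grouping vector to a factor, so the list is ordered by instance number.
list_base <- split(paste0(df_example$Parameter, "=", df_example$Value),
                   df_example$`Instance No.`)

print(list_base)
```

Individual instances can then be pulled out by name, e.g. list_base[["3"]].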