Relist rows of a data table in R following the structure of a desired list?

I am loading a csv file into a data table in R that looks like the below:

Name	Location	Dates.Start	Dates.End	...
John	Virtual	12/12/2022	13/12/2022	...
Anne	In person	05/01/2023	06/01/2023	...
Ben	Hybrid	10/01/2023	11/01/2023	...

I need to iterate over each row and use it as a list with the following structure:

row_list = list(Name = 'John' ,
                Location = 'Virtual',
                Dates = data.frame(Start = '12/12/2022',
                                   End = '13/12/2022'),
                ...
               )

I have tried converting each row to a list and I can do it by iterating over all rows and creating the required entities.

for (row in nrow(dt)){
  row_list = list(Name = dt$Name[row] ,
                Location = dt$Location[row],
                Dates = data.frame(Start = dt$Dates.Start[row],
                                   End = dt$Dates.End[row]),
                ...
               )
}

The problem is that I have multiple columns in the original data table and I would prefer a relist command that would automatically identify and merge the columns that need to be stores as a data frame. If I use the list() command the columns won't merge in a data frame (i.e. the 'Dates' data frame). Is there a way to merge the two 'Dates' into one data frame while creating the list without needing to call each value separately?

Solution

In this solution, first I prepare a list of columns toDf that must be bound in a data.frame. Then I get the list of other columns by difference (setdiff). The two sets then are merged into the Cols list.

library(purrr)

toDf <- list(Dates = c("Dates.Start", "Dates.End"))
otherCols <- setdiff(names(dt), reduce(toDf, c))
Cols <- c(setNames(otherCols, otherCols), toDf)

Cols

$Name
[1] "Name"

$Location
[1] "Location"

$Dates
[1] "Dates.Start" "Dates.End"

Then by using asplit the data.frame is split by row and each row is processed according to elements of Cols. When there is an element containing two or more column names, the corresponding data is merged into a data.frame

asplit(dt,1) |>
    map(~{
        data <- as.list(.x)
        map(Cols, ~if(length(.x) == 1)
                       data[[.x]]
                   else
                       data.frame(data[.x]))
    })
    
##> [[1]]
##> [[1]]$Name
##> [1] "John"
##> 
##> [[1]]$Location
##> [1] "Virtual"
##> 
##> [[1]]$Dates
##>   Dates.Start  Dates.End
##> 1  12/12/2022 13/12/2022
##> 
##> 
##> [[2]]
##> [[2]]$Name
##> [1] "Anne"
##> 
##> [[2]]$Location
##> [1] "In person"
##> 
##> [[2]]$Dates
##>   Dates.Start  Dates.End
##> 1  05/01/2023 06/01/2023
##> 
##> 
##> [[3]]
##> [[3]]$Name
##> [1] "Ben"
##> 
##> [[3]]$Location
##> [1] "Hybrid"
##> 
##> [[3]]$Dates
##>   Dates.Start  Dates.End
##> 1  10/01/2023 11/01/2023

An alternative equivalent by column solution can be:

map(Cols, ~if(length(.x) == 1)
               dt[[.x]]
           else
               split(dt[.x], ~ I(1:nrow(dt)))) |>
    transpose()