
R dataframe to "dictionary" avoiding list of factors


I have a data frame df with two columns: one contains names and the other contains values, which can be strings or doubles. For example:

> df
       name   value
1  cat_name    Bart
2   cat_age       5
3  dog_name    Fred
4   dog_age       9
5 total_pet       2

I'd like to convert df into a list of named objects so that I can call list$cat_name and get back the string "Bart", or list$cat_age and get back 5 as a numeric.

I've tried

> list <- split(df[, 2], df[, 1])
> list
$cat_age
[1] 5
Levels: 2 5 9 Bart Fred

$cat_name
[1] Bart
Levels: 2 5 9 Bart Fred

$dog_age
[1] 9
Levels: 2 5 9 Bart Fred

$dog_name
[1] Fred
Levels: 2 5 9 Bart Fred

$total_pet
[1] 2
Levels: 2 5 9 Bart Fred

which transforms df into a list of factors. It's nearly what I want because the $ operator works fine. However, I'm not really used to working with factors, and I'd like to know whether another dataframe-to-list transformation is available. The annoying part is that, in order to work with strings and numbers, we must convert the factors back to those types:

> as.character(list$cat_name)
[1] "Bart"
> as.numeric(as.character(list$total_pet))
[1] 2
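
To illustrate why the intermediate as.character() step matters, here is a minimal sketch. The factor f is a made-up stand-in for the value column above, not the original data: calling as.numeric() directly on a factor returns the internal level code rather than the label.

```r
# Hypothetical factor mimicking the 'value' column (assumption, not the real df)
f <- factor(c("Bart", "5", "Fred", "9", "2"))

# Levels are sorted, so "2" is the first level: 2 5 9 Bart Fred
levels(f)

# as.numeric() on a factor gives the level code of "2", not the number 2
codes <- as.numeric(f[5])

# Going through as.character() first recovers the label as a number
value <- as.numeric(as.character(f[5]))
```

Here codes is the position of "2" among the levels, while value is the number the label actually spells out.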

After noticing that df[, 1] and df[, 2] are actually factors, I tried

> list <- split(as.character(df[, 2]), df[, 1])
> list
$cat_age
[1] "5"

$cat_name
[1] "Bart"

$dog_age
[1] "9"

$dog_name
[1] "Fred"

$total_pet
[1] "2"

which nearly solves the problem, except that the numbers are character strings that still have to be converted later. I've also tried using hash objects (from the hash package):

> h <- hash(as.vector(df[, 1]), as.vector(df[, 2]))
> l = as.list(h)
> l
$dog_age
[1] "9"

$dog_name
[1] "Fred"

$cat_age
[1] "5"

$total_pet
[1] "2"

$cat_name
[1] "Bart"

but I get the same result.

Does anyone have advice? Am I missing something obvious?

Thanks :)


Solution

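  • We can do this with type.convert, which converts each element to the most specific type that fits; as.is = TRUE keeps strings as character instead of turning them into factors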

    library(purrr)
    map(list, type.convert, as.is = TRUE)
    #$cat_age
    #[1] 5
    
    #$cat_name
    #[1] "Bart"
    
    #$dog_age
    #[1] 9
    
    #$dog_name
    #[1] "Fred"
    
    #$total_pet
    #[1] 2
    

    Because the elements are converted independently, the work can also be run in parallel, which may help for large lists. One option is future_map from furrr

    library(furrr)
    plan(multisession)  # plan(multiprocess) is deprecated in recent versions of future
    future_map(list, type.convert, as.is = TRUE)
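
For reference, the same idea can be sketched in base R without purrr. This is a minimal example under stated assumptions, not necessarily the approach used above: the data frame is rebuilt by hand from the question, stringsAsFactors = FALSE is set so that no factors are created in the first place, and the names raw and pets are illustrative only.

```r
# Rebuild the example data; stringsAsFactors = FALSE keeps both columns as character
df <- data.frame(
  name  = c("cat_name", "cat_age", "dog_name", "dog_age", "total_pet"),
  value = c("Bart", "5", "Fred", "9", "2"),
  stringsAsFactors = FALSE
)

# Named list of character values, one element per row (row order is preserved)
raw <- setNames(as.list(df$value), df$name)

# Convert each element to the most specific type that fits;
# as.is = TRUE leaves non-numeric strings as character
pets <- lapply(raw, type.convert, as.is = TRUE)

pets$cat_name   # character
pets$cat_age    # integer
```

Unlike split(), which orders the result by the sorted names, setNames(as.list(...)) keeps the elements in the original row order. The numeric-looking values come back as integers, which is.numeric() also reports as numeric.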