R: Use vector as key parameter in tidyr spread function


I am trying to use the tidyr spread function, but I want to pass in my own vector of feature names as the key parameter.

For example, the default usage would be:

library(tidyr)

test <- data.frame(id = c(1, 1, 2, 2),
                   feat = c("feat1", "feat2", "feat1", "feat2"),
                   value = c(10, 20, 1000, 2000))
test %>% spread(key = feat, value = value, fill = 0)
  id feat1 feat2
1  1    10    20
2  2  1000  2000

I would like to pass in my own vector of feature strings to be used as the key, something like this:

featlist <- c("feat1", "feat2", "feat3")
test %>% spread(key = featlist, value = value, fill = 0)
# Desired output
  id feat1 feat2 feat3
1  1    10    20     0
2  2  1000  2000     0
# Actual output: an error
Error: `var` must evaluate to a single number or a column name, not a character vector
# Trying spread_ instead
test %>% spread_(key = featlist, value = "value", fill = 0)
Error: Only strings can be converted to symbols

Solution

  • Make the feat column a factor with levels set to featlist, then set the drop parameter to FALSE so that spread keeps the levels that do not appear in the data:

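    library(tidyr)

    test <- data.frame(id = c(1, 1, 2, 2),
                       feat = c("feat1", "feat2", "feat1", "feat2"),
                       value = c(10, 20, 1000, 2000))

    featlist <- c("feat1", "feat2", "feat3")
    # Declare every expected feature as a level, including the unobserved feat3
    test$feat <- factor(test$feat, levels = featlist)

    # drop = FALSE keeps unused levels as columns; fill = 0 supplies their values
    test %>% spread(key = feat, value = value, fill = 0, drop = FALSE)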
    

    Which results in:

      id feat1 feat2 feat3
    1  1    10    20     0
    2  2  1000  2000     0
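
If the same reshaping is needed in several places, the factor approach above can be wrapped in a small function so that the original data frame is not modified. The following is only a sketch: the helper name spread_features is hypothetical (it is not part of tidyr), and it assumes the key and value columns are literally named feat and value, as in the question.

```r
library(tidyr)

# Hypothetical helper (not a tidyr function): spread `feat` into columns and
# guarantee one column per entry of `features`, even for features that never
# occur in `data`. Assumes columns named `feat` and `value`.
spread_features <- function(data, features, fill = 0) {
  # Factor levels record every expected feature, observed or not.
  # This changes only the local copy of `data`, not the caller's object.
  data$feat <- factor(data$feat, levels = features)
  # drop = FALSE keeps unused levels as columns; `fill` supplies their values.
  spread(data, key = feat, value = value, fill = fill, drop = FALSE)
}

test <- data.frame(id = c(1, 1, 2, 2),
                   feat = c("feat1", "feat2", "feat1", "feat2"),
                   value = c(10, 20, 1000, 2000))
featlist <- c("feat1", "feat2", "feat3")

wide <- spread_features(test, featlist)
print(wide)
```

One caveat with this approach: factor() turns any feat value that is not listed in featlist into NA, so it is worth checking that featlist covers every value present in the data before spreading. Note also that recent tidyr releases mark spread() as superseded by pivot_wider(), although spread() still works.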