r, list, dataframe, named-entity-recognition

Append list with NULLs to data frame


I am using an NER library (entity) to extract person names from sentences in a data frame.

If I run:

library(entity)
dat <- data.frame(texts=c('Henry went home', 'Drive a car', 'Two snowmen'), stringsAsFactors=FALSE)
person_entity(dat$texts)

I get a list of extracted names:

> person_entity(dat$texts)
[[1]]
[1] "Henry"

[[2]]
NULL

[[3]]
NULL

How can I append this list as an additional column to my data frame? The additional column could be a list of the extracted names, or even just the length of the list, e.g.:

dat <- data.frame(texts=c('Henry went home', 'Drive a car', 'Two snowmen'), person_count=c(1,0,0), stringsAsFactors=FALSE)

Solution

  • One way is to use lengths(), which returns the length of each element of a list. A NULL element has length 0, so sentences with no extracted name get a count of 0.

    dat$person_count <- lengths(person_entity(dat$texts))
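
    The question also asks about keeping the extracted names themselves. A minimal sketch of both options follows. It assumes a hand-written list, persons, that mirrors the output shown in the question, so it runs without the entity package; with the package installed, persons would instead be person_entity(dat$texts).

```r
dat <- data.frame(texts = c('Henry went home', 'Drive a car', 'Two snowmen'),
                  stringsAsFactors = FALSE)

# Assumption: stand-in for person_entity(dat$texts), copied from the question
persons <- list("Henry", NULL, NULL)

# Option 1: number of names found per sentence (NULL has length 0)
dat$person_count <- lengths(persons)

# Option 2: keep the names as a list column; a list with one element
# per row can be assigned to a data frame column with `$<-`
dat$persons <- persons
```

    The list column keeps every name when a sentence contains more than one person, at the cost of being awkward to export (for example with write.csv), so the count column is the simpler choice if only the number is needed.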