I am new to R and I don't want to misunderstand the language and its data structures from the start. :)
My data.frame sample.data contains, besides 'normal' attributes (e.g. author), a nested list of data.frames (files), each of which has attributes such as extension.
How can I filter for authors who have created files with a certain extension? Is there an R-ic way of doing that? Maybe something in this direction:
t <- subset(data, data$files[['extension']] > '.R')
Actually I want to avoid for loops.
Here you can find some sample data:
d1 <- data.frame(extension=c('.py', '.py', '.c++')) # and some other attributes
d2 <- data.frame(extension=c('.R', '.py')) # and some other attributes
sample.data <- data.frame(author=c('author_1', 'author_2'), files=I(list(d1, d2)))
The JSON that sample.data comes from looks like this:
[
  {
    "author": "author_1",
    "files": [
      {
        "extension": ".py",
        "path": "/a/path/somewhere/"
      },
      {
        "extension": ".c++",
        "path": "/a/path/somewhere/else/"
      }, ...
    ]
  }, ...
]
Interesting, not many people use R to simulate a hierarchical database!
subset(sample.data, sapply(files, function(df) any(df$extension == ".R")))
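To see why that one-liner works, it may help to split it into two steps: first build one TRUE/FALSE value per author, then use that logical vector to pick rows. The sketch below reuses the sample data from the question. The helper name has_extension and the variables keep and result are made up for illustration, and vapply is swapped in for sapply only because it guarantees a logical vector even when the list is empty.

```r
# Sample data copied from the question.
d1 <- data.frame(extension = c('.py', '.py', '.c++'))
d2 <- data.frame(extension = c('.R', '.py'))
sample.data <- data.frame(author = c('author_1', 'author_2'),
                          files = I(list(d1, d2)))

# Hypothetical helper (not part of base R): returns one logical per
# element of `files`, TRUE if that nested data.frame contains `ext`.
has_extension <- function(files, ext) {
  vapply(files, function(df) any(df$extension == ext), logical(1))
}

# Step 1: one TRUE/FALSE per author.
keep <- has_extension(sample.data$files, '.R')

# Step 2: ordinary logical row indexing on the outer data.frame.
result <- sample.data[keep, ]
as.character(result$author)
```

With the sample data only author_2 has a '.R' file, so only that row is kept. There is no explicit for loop; the iteration over the nested data.frames happens inside vapply.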