Nested, hierarchical data frames in R

I am new to R and I don't want to misunderstand the language and its data structures from the start. :)

My data.frame contains, besides 'normal' attributes (e.g. author), a nested list of data.frames (files), each of which has attributes such as extension.

How can I filter for authors who have created files with a certain extension? Is there an R-ic way of doing that? Maybe something in this direction:

t <- subset(data, data$files[['extension']] > '.R')

I would like to avoid for loops.

Here you can find some sample data:

d1 <- data.frame(extension=c('.py', '.py', '.c++')) # and some other attributes
d2 <- data.frame(extension=c('.R', '.py')) # and some other attributes
data <- data.frame(author=c('author_1', 'author_2'), files=I(list(d1, d2)))

The JSON the data comes from looks like this:

[
    {
        "author": "author_1",
        "files": [
            {
                "extension": ".py",
                "path": "/a/path/somewhere/"
            }, {
                "extension": ".c++",
                "path": "/a/path/somewhere/else/"
            }, ...
        ]
    }, ...
]


  • Interesting, not many people use R to simulate a hierarchical database!

    subset(data, sapply(files, function(df) any(df$extension == ".R")))
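
For anyone who wants to try this end to end, here is a minimal sketch on the sample data from the question. The helper name `has_extension` is made up for illustration, and `vapply` is swapped in for `sapply` only so that the result is guaranteed to be a logical vector; the idea is the same: build one TRUE/FALSE per author and use it to index the rows.

```r
# Sample data from the question: one row per author, plus a list column
# holding one data frame of files per author. I() stops data.frame()
# from trying to flatten the list into separate columns.
d1 <- data.frame(extension = c(".py", ".py", ".c++"))
d2 <- data.frame(extension = c(".R", ".py"))
data <- data.frame(author = c("author_1", "author_2"),
                   files = I(list(d1, d2)))

# Hypothetical helper (not from the original answer): returns one logical
# per author, TRUE if any of that author's files has the extension `ext`.
has_extension <- function(files, ext) {
  vapply(files, function(df) any(df$extension == ext), logical(1))
}

# Keep only the rows whose nested data frame contains a ".R" file.
r_authors <- data[has_extension(data$files, ".R"), ]
as.character(r_authors$author)
# [1] "author_2"
```

No explicit for loop is needed, because `vapply` walks the list column once and the logical vector it returns does the row selection.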