
Extracting from Reddit works fine, but cannot save as an Excel file


I'm using pushshiftr to extract Reddit posts, and that part works fine:

install.packages("devtools")
devtools::install_github("whereofonecannotspeak/pushshiftr")
library(pushshiftr)
p <- ps_search_submissions(NA, subreddit = "disability", after = "2019-12-26", before = "2019-12-27")

The package is from here:

https://github.com/dashstander/pushshiftr

Now p holds the posts, but when I try to save them to Excel with

library(writexl)
write_xlsx(p, "C:/Users/Reddit/posts.xlsx")

I get the error:

Argument x must be a data frame or list of data frames

In this case p is a list, not a data frame, and I couldn't find out how to export it to Excel.


Solution

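  • If we need a single dataset, we can bind the selected fields of every post into one data frame. Here stack() gives a long format: one row per field value, with 'grp' identifying the post.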

    library(purrr)
    out <- map_dfr(p,  
       ~ .x[c('author', 'body', 'created_utc', 'score')] %>% stack, .id = 'grp')
    

    Or with base R, which gives one row per post and one column per field:

    out <- type.convert(as.data.frame(do.call(rbind, 
      lapply(p, function(x) do.call(c, x[c('author', 'body', 
            'created_utc', 'score')])))), as.is = TRUE)
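
    Once the list has been flattened into a data frame, write_xlsx() should accept it. Below is a minimal sketch of the whole round trip, assuming each element of p is a named list of scalar fields. The sample p, its values, and the temp-file path are made up for illustration and are not real pushshiftr output:

```r
# Hypothetical stand-in for `p`: a list of posts, each a named list of fields.
# (Assumption: real pushshiftr results have a comparable shape.)
p <- list(
  list(author = "user_a", body = "first post",  created_utc = 1577318400, score = 3),
  list(author = "user_b", body = "second post", created_utc = 1577322000, score = 7)
)

fields <- c("author", "body", "created_utc", "score")

# Build a one-row data frame per post; a field absent from a post becomes NA.
rows <- lapply(p, function(x) {
  vals <- lapply(fields, function(f) if (is.null(x[[f]])) NA else x[[f]])
  as.data.frame(setNames(vals, fields), stringsAsFactors = FALSE)
})

# Stack the rows into a single data frame (one row per post).
out <- do.call(rbind, rows)

# write_xlsx() takes a data frame, so `out` can be exported directly.
# The path here is a temporary file chosen for illustration only.
if (requireNamespace("writexl", quietly = TRUE)) {
  writexl::write_xlsx(out, file.path(tempdir(), "posts.xlsx"))
}
```

    With the real data, out from either approach above can be passed to write_xlsx() in place of p, using the original output path.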