I am using WikipediR to query revision IDs and check whether the very next edit is a 'rollback' or an 'undo'.
I am interested in the tags and the revision comment, which let me identify whether the edit was undone or rolled back. My code for a single revision ID is:
library(WikipediR)
wp_diff <- revision_diff("en", "wikipedia", revisions = "883987486", properties = c("tags", "comment"), direction = "next", clean_response = TRUE, as_wikitext = TRUE)
I then convert the output to a data frame using:
library(dplyr)
library(tibble)
diff <- do.call(rbind, lapply(wp_diff, as.data.frame, stringsAsFactors = FALSE))
This works great for a single revision ID. How would I loop or map over a vector of many revision IDs?
I tried
vec <- c("883987486","911412795")
for (i in 1:length(vec)){
  wp_diff[i]<- revision_diff("en", "wikipedia", revisions = i, properties = c("tags", "comment"), direction = "next", clean_response = T, as_wikitext=T)
}
But this produces the following error when I try to convert the output list to a data frame:
Error in (function (..., row.names = NULL, check.rows = FALSE, check.names = TRUE, : arguments imply differing number of rows: 1, 0
Does anybody have any suggestions? I am not sure how to proceed.
Thanks.
Try the following code:
# Wrap the query and the conversion to a data frame in a function
make_diff_df <- function(rev) {
  wp_diff <- revision_diff("en", "wikipedia", revisions = rev,
                           properties = c("tags", "comment"),
                           direction = "next", clean_response = TRUE,
                           as_wikitext = TRUE)
  DF <- do.call(rbind, lapply(wp_diff, as.data.frame, stringsAsFactors = FALSE))
  # Set fixed column names so the results can be row-bound later
  names(DF) <- c("pageid", "ns", "title", "revisions.diff.from",
                 "revisions.diff.to", "revisions.diff..",
                 "revisions.comment", "revisions..mw.rollback.")
  return(DF)
}
vec <- c("883987486","911412795")
# Use do.call and lapply with the function
do.call("rbind", lapply(vec, make_diff_df))
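As a side note, the loop in the question likely fails for two reasons: it passes the loop counter i (1, 2) as the revision instead of vec[i], and it stores the result with [i], which keeps only the first element of a longer list. Below is a minimal sketch of the corrected loop pattern. To keep it runnable without a network call, it uses fake_diff(), a hypothetical stand-in for revision_diff() that is not part of WikipediR.

```r
# Hypothetical stand-in for revision_diff(): returns a small nested list
# per revision ID so the example runs without querying the API
fake_diff <- function(rev) {
  list(list(revid = rev, comment = paste("comment for", rev)))
}

vec <- c("883987486", "911412795")

# Preallocate a list with one slot per revision ID
results <- vector("list", length(vec))

for (i in seq_along(vec)) {
  # Pass the ID itself (vec[i]), not the loop counter i,
  # and store the whole result with [[i]] rather than [i]
  results[[i]] <- fake_diff(vec[i])
}

# Each slot now holds the full result for the matching ID
ids <- vapply(results, function(x) x[[1]]$revid, character(1))
print(ids)
```

In the real loop, fake_diff(vec[i]) would be replaced by the revision_diff() call with revisions = vec[i].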
Note that you have to fix the column names of DF inside the make_diff_df function so that the rbind inside do.call works. The column names returned for the two revisions in the example are very similar, but rbind needs them to match exactly.
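If you would rather not overwrite the names, one alternative is to pad each data frame with the columns it lacks before binding. The following is only a sketch on toy data: the column names are made up for illustration, and align_cols() is a hypothetical helper, not something WikipediR provides.

```r
# Two toy data frames whose columns differ slightly, standing in for
# per-revision results (these column names are invented for the example)
df1 <- data.frame(title = "A", comment = "undo", rollback = "mw-rollback",
                  stringsAsFactors = FALSE)
df2 <- data.frame(title = "B", comment = "fix typo", undo = "mw-undo",
                  stringsAsFactors = FALSE)

# Hypothetical helper: add any missing columns as NA,
# then return the columns in one consistent order
align_cols <- function(df, all_cols) {
  for (col in setdiff(all_cols, names(df))) {
    df[[col]] <- NA_character_
  }
  df[all_cols]
}

dfs <- list(df1, df2)

# Union of all column names, in order of first appearance
all_cols <- unique(unlist(lapply(dfs, names)))

# rbind now succeeds because every data frame has the same columns
combined <- do.call(rbind, lapply(dfs, align_cols, all_cols = all_cols))
print(combined)
```

This keeps each tag in its own column instead of forcing every result under the same fixed names, at the cost of some NA values.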
Hope this helps.