I've run into an error with the GTrendsR package that other examples on StackOverflow don't cover: how to loop through several searches using for or lapply.
When I do something simple like gtrends(ch, query = "Harvard University" , geo = "US") it works.
But when I loop, I get an error that doesn't occur when I do a simple search on one keyword:
Error in charToDate(x) : character string is not in a standard unambiguous format
from lapply(queries, function(x) gtrends(ch, query = x , geo = "US"))
and
for (i in seq_along(queries)) {
    x <- queries[i]
    dta[i,] <- gtrends(ch, query = x , geo = "US")$trend # trend data.frame returned from gtrends()
}
In case background and code are needed: I'm trying to get the Google Trends search history for the US college names listed in IPEDS (from the US Department of Education; the download URL is in the code below).
I'm using the GTrendsR package, installed with:
devtools::install_bitbucket(repo = "gtrendsr", username="persican")
Single search terms work fine. But as soon as I try to automate, I get the GTrendsR error.
library("GTrendsR", lib.loc="~/Library/R/3.1/library")
download.file("https://inventory.data.gov/dataset/032e19b4-5a90-41dc-83ff-6e4cd234f565/resource/38625c3d-5388-4c16-a30f-d105432553a4/download/postscndryunivsrvy2013dirinfo.csv" , destfile="ipeds.csv", method="curl")
colleges <- read.csv("./ipeds.csv", header=T, stringsAsFactors=F)
queries <- colleges$INSTNM # Institution Names
# prepopulate a data frame with 3 columns for the output of gtrends()
dta <-data.frame(matrix(NA, length(queries),3))
# set credentials
usr <- "[email protected]"
psw <- "yourpassword"
ch <- gconnect(usr, psw)
For loop to automate:
for (i in seq_along(queries)) {
    x <- queries[i]
    dta[i,] <- gtrends(ch, query = x , geo = "US")$trend # trend data.frame returned from gtrends()
}
lapply doesn't work either:
lapply(queries, function(x) gtrends(ch, query = x , geo = "US")$trend)
I get this error:
Error in charToDate(x) : character string is not in a standard unambiguous format
The error seems to come from a charToDate() call inside a dependency, and I can't figure out how to get at it.
However, when I use just 3 searches it works:
three <- list("Harvard University", "Boston College", "Bard College")
out <- sapply(three, function(x) cbind.data.frame(gtrends(ch, query = x , geo = "US")$trend[3])[])
This happens because URLs/browsers choke on spaces, and your search phrases contain spaces. The error message is not that helpful here, but @Richard asked a question that got me thinking on the right track.
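As an aside on where that message comes from: charToDate() is an internal helper of base R's as.Date(), so the same error can be reproduced without GTrendsR at all. My guess (an assumption, I haven't traced the package source) is that gtrends() gets back a response that doesn't contain the dates it expects and then fails while parsing it. A minimal sketch:

```r
# as.Date() on a string that is not a date raises the same error
# the question reports; tryCatch() captures the message instead of stopping
msg <- tryCatch(
  as.Date("not a date"),
  error = function(e) conditionMessage(e)
)
msg
```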
So you are passing terms with spaces, but Google wants + rather than spaces. gsub to the rescue.
x <- "one two three"
gsub("\\s+", "\\+", x)
## [1] "one+two+three"
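Since gsub is vectorised, the same call works on a whole vector of names at once. Here is a small sketch using the three names from the question (queries_demo and plus_queries are just illustrative names); a plain "+" as the replacement should behave the same as "\\+":

```r
# three college names taken from the question
queries_demo <- c("Harvard University", "Boston College", "Bard College")

# replace each run of whitespace with a single "+", for every element
plus_queries <- gsub("\\s+", "+", queries_demo)
plus_queries
## [1] "Harvard+University" "Boston+College"     "Bard+College"
```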
Now applied to your problem. I also threw a try in there to deal with errors you may get. This will return a list of data frames, with NULL in place of any query that failed.
colleges <- read.csv("ipeds.csv", header=TRUE, stringsAsFactors=FALSE)
queries <- colleges[["INSTNM"]]
dta <- data.frame(matrix(NA, length(queries),3))
usr <- "[email protected]"
psw <- "password"
ch <- gconnect(usr, psw)
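output <- lapply(queries, function(x) {
    # turn hyphens and commas into spaces, then runs of whitespace into "+"
    x <- gsub("\\s+", "\\+", gsub("[-,]", " ", x))
    # try() keeps one failed query from aborting the whole run
    out <- try(gtrends(ch, query = x , geo = "US")[["trend"]])
    if (inherits(out, "try-error")) return(NULL)
    out
})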
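Once you have that list, you will probably want to know which queries failed and stack the rest into one data frame. Below is a rough sketch of one way to do it. It uses a stand-in function, fake_trend(), instead of gtrends() so it runs without Google credentials; the stand-in and its column names (start, end, hits) are my assumptions, not the real GTrendsR output.

```r
# Hypothetical stand-in for gtrends(ch, query = x, geo = "US")[["trend"]].
# It returns a made-up 3-column data frame and fails on purpose for one query.
fake_trend <- function(x) {
  if (grepl("Bard", x)) stop("simulated failure")
  data.frame(start = as.Date("2014-01-05") + c(0, 7),
             end   = as.Date("2014-01-11") + c(0, 7),
             hits  = c(10, 20))
}

queries_demo <- c("Harvard University", "Boston College", "Bard College")

output_demo <- lapply(queries_demo, function(x) {
  x <- gsub("\\s+", "+", gsub("[-,]", " ", x))
  out <- try(fake_trend(x), silent = TRUE)
  if (inherits(out, "try-error")) return(NULL)
  out
})
names(output_demo) <- queries_demo   # label each result with its original query

# split into failed queries (NULL) and successful results
is_null <- vapply(output_demo, is.null, logical(1))
failed  <- names(output_demo)[is_null]
ok      <- output_demo[!is_null]

# add a query column to each data frame, then stack them
combined <- do.call(rbind, Map(function(d, q) {
  data.frame(query = q, d, stringsAsFactors = FALSE)
}, ok, names(ok)))
rownames(combined) <- NULL
```

With the real gtrends() call swapped back in, failed gives you the names to retry or inspect by hand.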