
GTrendsR: Error in charToDate(x) when looping over search terms


I've hit an error with the GTrendsR package that other StackOverflow questions don't cover: how to loop through several searches using for or lapply.

When I do something simple like gtrends(ch, query = "Harvard University" , geo = "US"), it works.

But when I loop, I get an error that doesn't occur when I do a simple search on one keyword:

Error in charToDate(x) : character string is not in a standard unambiguous format

from lapply(queries, function(x) gtrends(ch, query = x , geo = "US"))

and

for (i in seq_along(queries)) {
      x <- queries[i]
      dta[i,] <-  gtrends(ch, query = x , geo = "US")$trend   # trend data.frame returned from gtrends()
}

In case background and code are needed: I'm trying to get Google Trends search history for US college names listed in IPEDS (from the US Department of Education).

I'm using the GTrendsR package, installed with:

devtools::install_bitbucket(repo = "gtrendsr", username="persican")

Single search terms work fine, but as soon as I try to automate, I get the GTrendsR error.

library("GTrendsR", lib.loc="~/Library/R/3.1/library")

download.file("https://inventory.data.gov/dataset/032e19b4-5a90-41dc-83ff-6e4cd234f565/resource/38625c3d-5388-4c16-a30f-d105432553a4/download/postscndryunivsrvy2013dirinfo.csv" , destfile="ipeds.csv", method="curl")

colleges <- read.csv("./ipeds.csv", header=T, stringsAsFactors=F)

queries <- colleges$INSTNM  # Institution Names

Prepopulating a data frame with 3 columns from the gtrends function:

dta <-data.frame(matrix(NA, length(queries),3)) 

Set credentials:

usr <- "you@example.com"  # placeholder address
psw <- "yourpassword"
ch <- gconnect(usr, psw)

For loop to automate:

for (i in seq_along(queries)) {
      x <- queries[i]
      dta[i,] <-  gtrends(ch, query = x , geo = "US")$trend   # trend data.frame returned from gtrends()
}

lapply doesn't work either:

lapply(queries, function(x) gtrends(ch, query = x , geo = "US")$trend)

I get this error:

Error in charToDate(x) : character string is not in a standard unambiguous format

The error seems to come from a call to charToDate(), which I can't figure out how to get at.

However, when I use just 3 searches it works:

three <- list("Harvard University", "Boston College", "Bard College")

out <- sapply(three, function(x)  cbind.data.frame(gtrends(ch, query = x , geo = "US")$trend[3])[])
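
For context on the error message: charToDate() is not a GTrendsR function but an internal helper inside base R's as.Date() for character input, which is why it can't be called directly. A minimal sketch that reproduces the same message with base R only is below. It assumes nothing about GTrendsR internals, and the input string is a made-up example.

```r
# Reproduce the message with base R only (no GTrendsR needed).
# "not a date" is a hypothetical input chosen to fail date parsing.
msg <- tryCatch(
    {
        as.Date("not a date")
        "no error"
    },
    error = function(e) conditionMessage(e)
)
print(msg)

# A string in one of the default formats parses without error.
ok <- as.Date("2014-10-05")
print(ok)
```

So the message means that something which was expected to be a date string was not one. That would be consistent with a request that did not return the usual trend data, though this sketch by itself does not confirm it.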

Solution

  • This happens because URLs can't contain literal spaces, and your search phrases do. The error message is not that helpful here, but @Richard asked a question that got me thinking along the right track.

    So you are passing terms with spaces, but Google wants + rather than spaces. gsub to the rescue.

    x <- "one two three"
    gsub("\\s+", "\\+", x)
    
    ## [1] "one+two+three"
    

    Now applied to the problem. I also replaced hyphens and commas with spaces first, and wrapped the call in try to deal with errors you may get. This returns a list of data frames, with NULL for any query that failed.

    colleges <- read.csv("ipeds.csv", header=TRUE, stringsAsFactors=FALSE)
    queries <- colleges[["INSTNM"]]
    dta <- data.frame(matrix(NA, length(queries),3)) 
    
    usr <- "you@example.com"  # placeholder address
    psw <- "password"
    ch <- gconnect(usr, psw)
    
    output <- lapply(queries, function(x) {
        x <- gsub("\\s+", "\\+", gsub("[-,]", " ", x))
        out <- try(gtrends(ch, query = x , geo = "US")[["trend"]])
        if (inherits(out, "try-error")) return(NULL)
        out
    })
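
To check the cleaning and error-handling steps without a Google login, the same pattern can be sketched with a stand-in for gtrends(). Here clean_query() and fake_fetch() are hypothetical names introduced for illustration. fake_fetch() only simulates a request that fails for one term and is not part of GTrendsR.

```r
# Hypothetical helper: the same substitutions as in the answer above
# (hyphens/commas -> spaces, then runs of whitespace -> "+").
clean_query <- function(x) {
    gsub("\\s+", "+", gsub("[-,]", " ", x))
}

# Hypothetical stand-in for gtrends(): fails for one term, otherwise
# returns a small data frame so the loop can be tested offline.
fake_fetch <- function(query) {
    if (query == "Bad+Term") stop("simulated request failure")
    data.frame(query = query, hits = nchar(query), stringsAsFactors = FALSE)
}

queries <- c("Harvard University", "Bad Term",
             "University of California-Berkeley")

output <- lapply(queries, function(x) {
    out <- try(fake_fetch(clean_query(x)), silent = TRUE)
    if (inherits(out, "try-error")) return(NULL)
    out
})
names(output) <- queries

# Terms whose request failed come back as NULL.
failed <- names(output)[vapply(output, is.null, logical(1))]

# Drop the failures and stack the rest into one data frame.
combined <- do.call(rbind, Filter(Negate(is.null), output))
rownames(combined) <- NULL

print(failed)
print(combined)
```

Naming the list by the original query makes it easy to see which terms failed and retry them later. Collecting results in a list also avoids having to know the shape of each returned data frame in advance, which a pre-sized data frame would require.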