I want to download search query data from Google Trends for both Japanese and English search terms. It works perfectly fine when I use English search terms only, but it stops working as soon as I include Japanese characters.
My code is the following (I included a default keyword just for this example to make it easier to use):
URL_GT=function(keyword="Toyota Aygo %2B Toyota Yaris %2B Toyota Vitz %2B トヨタヴィッツ", year=2010, month=1, length=68){
  # Fixed parts of the export URL
  start="http://www.google.com/trends/trendsReport?hl=en-US&q="
  end="&cmpt=q&content=1&export=1"
  date=""
  # Join multiple keywords with an encoded comma ("%2C")
  queries=paste(keyword, collapse="%2C ")
  # Dates: start month/year, followed by the number of months
  if(!is.na(year)){
    date=paste("&date=", month, "%2F", year, " ", month+length-1, "m", sep="")
  }
  URL=paste(start, queries, date, end, sep="")
  # Open the download URL in the default browser
  browseURL(URL)
}
When I look at the download URL that gets opened in my browser, I can see that the Japanese characters have been transformed into sequences of percent signs, digits and letters, although they are not supposed to change at all.
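For reference, the sequences I see look like percent-encoded bytes. The following is only a minimal sketch, assuming the browser encodes the UTF-8 bytes of each character; pct_encode_utf8 is a hypothetical helper name, not part of any package. It reproduces that kind of output in R:

```r
# Hypothetical helper: percent-encode every UTF-8 byte of a string.
# Unlike a real URL encoder, it also encodes plain ASCII characters.
pct_encode_utf8 <- function(x) {
  bytes <- charToRaw(enc2utf8(x))
  paste0("%", toupper(as.character(bytes)), collapse = "")
}

# "\u30c8\u30e8\u30bf" is the first three katakana of the search term (トヨタ),
# written with escapes so the example does not depend on the locale.
encoded <- pct_encode_utf8("\u30c8\u30e8\u30bf")
print(encoded)  # "%E3%83%88%E3%83%A8%E3%82%BF"
```

If the URL in the browser contains sequences of this shape, the characters may have been passed on intact and only encoded for transport.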
When I use
Sys.setlocale("LC_CTYPE","japanese_JAPAN")
I get the following paste result
paste("トヨタヴィッツ","Toyota Vitz", sep = "")
[1] "ƒgƒˆƒ^ƒ”ƒBƒbƒcToyota Vitz"
I think this shows quite clearly that paste() does not work as intended.
Using
Sys.setlocale("LC_CTYPE","german_GERMANY")
I get the following error message
unexpected INCOMPLETE_STRING
1: URL_GT=function(keyword="Toyota Aygo %2B Toyota Yaris %2B Toyota Vitz %2B ?
indicating that R cannot interpret the Japanese characters.
I tried to find a solution, but could only find tips that led me to change my locale. As described above, this has not worked for me so far. I also found this tip, but I got the same warning as the person who asked that question, namely
Warning message: In Sys.setlocale("LC_CTYPE", "UTF-8") : OS reports request
to set locale to "UTF-8" cannot be honored
I am very grateful for any help! Since this is my first post ever I hope that everything concerning structure and detail is alright.
I found a solution that works just fine for me. I had to change the language for non-Unicode programs in order for the Japanese locale to work properly.
On Windows 8.1, go to Control Panel > Clock, Language, and Region > Region > Administrative tab, and change the system locale under "Language for non-Unicode programs" (Japanese in my case). Restart your PC afterwards.
If you now set your locale to
Sys.setlocale("LC_CTYPE","japanese_JAPAN")
paste() should return what you expect, e.g.
paste("It works", "トヨタヴィッツ", sep=" ")
[1] "It works トヨタヴィッツ"
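As a possible alternative for getting the characters into R without relying on how the script file is read, the term can be written with Unicode escapes. This is only a sketch under that assumption; the variable names are made up for illustration, and whether the console displays the result correctly still depends on the locale and font:

```r
# Spell the katakana term with \u escapes so the source stays pure ASCII.
# The code points are meant to spell トヨタヴィッツ.
vitz <- "\u30c8\u30e8\u30bf\u30f4\u30a3\u30c3\u30c4"

result <- paste("It works", vitz, sep = " ")

# The string holds 7 characters regardless of how it is displayed.
n_chars <- nchar(vitz, type = "chars")
print(n_chars)  # 7
```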
The only thing that still confuses me is that when I open the downloaded file in Excel, the Japanese characters are garbled again, in a different way.
I tried downloading the data for the term manually and got the same result in the Excel file, so I guess the data itself is correct. Unfortunately, I did not download a CSV file of the Japanese data before I changed my language for non-Unicode programs, so I cannot tell whether Excel garbled it back then as well. But when I restored my settings to German, the same cryptic characters appeared in the downloaded file.
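One way to check whether the file itself is fine is to read it into R instead of Excel. The sketch below assumes the export is a UTF-8 encoded CSV, which I have not verified; csv_path and the two-column layout are stand-ins for the real download:

```r
# Stand-in for the downloaded report: write a tiny UTF-8 CSV to a temp file.
csv_path <- tempfile(fileext = ".csv")
term <- "\u30c8\u30e8\u30bf"  # first three katakana of the search term
lines <- enc2utf8(c("term,value", paste0(term, ",42")))
writeLines(lines, csv_path, useBytes = TRUE)  # write the UTF-8 bytes unchanged

# encoding = "UTF-8" marks the strings as UTF-8 instead of converting them
# to the native encoding, so it does not depend on the current locale.
dat <- read.csv(csv_path, encoding = "UTF-8", stringsAsFactors = FALSE)

# Compare against the expected term rather than relying on console display.
same_term <- dat$term[1] == term
print(same_term)
```

If the comparison holds for the real file, the garbling would come from how Excel guesses the encoding rather than from the download.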