I have a CSV file, and I want to extract each column as a string so I can use it with the getSymbols function from the quantmod package.
The csv file looks like this:
AEGR,Aegerion Pharmaceuticals Inc
AKS,AK Steel Holding Corp
ALXA,Alexza Pharmaceuticals Inc
CCL,Carnival Corporation
CECO,Career Education Corp
CDXS,Codexis Inc
And I use this code to read the file:
data<-read.csv(file='CAPM/allquotes.csv',header=F)
symbols=gettext(data[,1])
symbol.names=gettext(data[,2])
getSymbols(symbols)
I get this error:
Error in download.file(paste(yahoo.URL, "s=", Symbols.name, "&a=", from.m, : cannot open URL 'http://chart.yahoo.com/table.csv?s=ALXA&a=0&b=01&c=2007&d=5&e=16&f=2012&g=d&q=q&y=0&z=ALXA&x=.csv'
In addition: Warning message:
In download.file(paste(yahoo.URL, "s=", Symbols.name, "&a=", from.m, : cannot open: HTTP status was '404 Not Found'
When I enter the symbols one by one, it works fine. I've also noticed that when I move the cursor to the end of the last line, the margins look corrupted. In the image you can see that, in the values of 'symbols', the end of the line sits a few spaces further to the right than it should (you can tell from the color of the opening parenthesis).
Your CSV has hidden characters in it, namely a left-to-right mark (U+200E). Since you are using RStudio, you can remove it with gsub, using "\016" as the value for the pattern argument. Alternatively, instead of removing the hidden character that you don't want, you can keep only the characters that you know you do want. For example, if your symbols will only contain letters and/or numbers, you could use something like gsub("[^A-Za-z0-9]", "", data[, 1])
data <- read.csv(text="AEGR,Aegerion Pharmaceuticals Inc
AKS,AK Steel Holding Corp
ALXA,Alexza Pharmaceuticals Inc
CCL,Carnival Corporation
CECO,Career Education Corp
CDXS,Codexis Inc", header=FALSE)
#data[, 1] <- gsub("\016", "", data[, 1]) #this should work in RStudio
data[, 1] <- gsub("[^A-Za-z0-9]", "", data[, 1]) #but this should work anywhere
symbols <- as.character(data[, 1]) # plain character vector of the cleaned tickers
library(quantmod) # provides getSymbols
getSymbols(symbols, src='yahoo')
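One caveat with the keep-only approach: some tickers contain characters other than letters and digits (for example "BRK-B" or "^GSPC"), and a strict whitelist would silently change them. The sketch below compares the strict pattern with a slightly wider one. The sample vector and the names strict and wider are made up for illustration, and the exact set of extra characters to allow is an assumption you would adjust to your own symbol list.

```r
# Hypothetical symbols; "\u200e" recreates the hidden left-to-right mark.
x <- c("BRK-B\u200e", "^GSPC", "ALXA\u200e")

# Strict whitelist from the answer: letters and digits only.
strict <- gsub("[^A-Za-z0-9]", "", x)

# Wider whitelist (an assumption): also keep ".", "^" and "-".
wider <- gsub("[^A-Za-z0-9.^-]", "", x)

print(strict)
print(wider)
```

Both patterns drop the hidden mark, but only the wider one leaves the hyphen and caret in place.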
After you read.csv, you can examine the data object to see that something is amiss.
s <- as.character(data[, 1])
str(s)
#chr [1:6] "AEGR" "AKS" "ALXA""| __truncated__ "CCL""| __truncated__ "CECO""| __truncated__ "CDXS""| __truncated__
str(s[3])
#chr "ALXA""| __truncated__
charToRaw(s[3])
#[1] 41 4c 58 41 e2 80 8e
# Compare what we have to what we think we have
charToRaw("ALXA")
#[1] 41 4c 58 41
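The three extra bytes e2 80 8e are the UTF-8 encoding of U+200E, the left-to-right mark. As a minimal sketch, the same check can be run over the whole vector rather than one element at a time. The vector below is typed in by hand to mimic what str(s) showed above, and has_hidden and cleaned are illustrative names only. Matching the mark by its code point with "\u200e" is one possible way to remove it, assuming the strings are UTF-8 encoded.

```r
# Hand-built stand-in for the column read from the CSV; "\u200e" recreates
# the hidden left-to-right mark that follows four of the six symbols.
s <- c("AEGR", "AKS", "ALXA\u200e", "CCL\u200e", "CECO\u200e", "CDXS\u200e")

# Flag every element that contains at least one byte outside the ASCII range.
has_hidden <- vapply(
  s,
  function(x) any(as.integer(charToRaw(x)) > 127L),
  logical(1),
  USE.NAMES = FALSE
)
print(s[has_hidden])

# Remove the mark by its code point, then confirm only ASCII bytes remain.
cleaned <- gsub("\u200e", "", s, fixed = TRUE)
print(charToRaw(cleaned[3]))
```

If has_hidden is all FALSE after cleaning, the symbols contain nothing but plain ASCII.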