
Convert a .csv file for further manipulation with the 'highfrequency' package in R


The highfrequency package is designed to convert .txt and .csv files from NYSE TAQ and WRDS TAQ, respectively, into .RData files of xts objects, which can then be manipulated easily with the package.

The problem is that I have limited access to the WRDS database: I can only download tick data from the CRSP (Center for Research in Security Prices) database, not from the TAQ (Trades and Quotes) database. So my data look like this. The downloadable file contains tick data for the REIT index from 2014-01-01 to 2014-01-05. I manually renamed the ticker column header to PRICE, as suggested by Kris Boudt, one of the package's main authors.

The code that I use is the following:

 from = "2014-03-01"
 to = "2014-04-31"
 datasource = "C:/Users/aris/Desktop/raw_data"
 datadestination = "C:/Users/aris/Desktop/xts_data"
 convert(from = from, to = to, datasource = datasource, datadestination = datadestination,
         trades = TRUE, quotes = FALSE, ticker = "REIT", dir = FALSE, extension = "csv", header = TRUE,
         tradecolnames = NULL, quotecolnames = NULL, format = "%Y%m%d %H:%M:%S", onefile = TRUE)

I suspect that the problem lies in the argument format = "%Y%m%d %H:%M:%S", since in the .csv file the date and the time are comma-separated. I tried putting a comma between %d and %H, as in format = "%Y%m%d,%H:%M:%S", but that did not help.
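
For reference, a format string such as "%Y%m%d %H:%M:%S" describes one date-time string, not two CSV fields. The base-R sketch below assumes (this is an assumption, not something confirmed by the package source) that the DATE and TIME columns are read separately and joined with a space before parsing. The sample values are invented for illustration.

```r
# Hypothetical rows in the shape DATE,TIME,PRICE (values invented for illustration).
csv_text <- "DATE,TIME,PRICE\n20140102,09:30:00,100.5\n20140102,09:30:01,100.6"

# read.csv splits on the comma, so DATE and TIME arrive as separate columns.
# colClasses = "character" keeps the fields as text instead of coercing them.
ticks <- read.csv(text = csv_text, colClasses = "character")

# Assumption: the two fields are joined with a space before being parsed.
stamp <- paste(ticks$DATE, ticks$TIME)
parsed <- as.POSIXct(stamp, format = "%Y%m%d %H:%M:%S", tz = "GMT")

# The comma variant does not match the space-joined string, so it yields NA.
parsed_comma <- as.POSIXct(stamp, format = "%Y%m%d,%H:%M:%S", tz = "GMT")

print(parsed)
print(parsed_comma)
```

If the columns are indeed joined this way, the comma in the raw file never reaches the parser, which would explain why changing the format string made no difference.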

The error reads:

 Error in `$<-.data.frame`(`*tmp*`, "COND", value = numeric(0)) :   
 replacement has 0 rows, data has 1048575
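
This message is what base R emits when a zero-length vector is assigned to a column of a non-empty data frame. Below is a minimal reproduction that does not use highfrequency. It assumes (not verified against the package source) that the conversion tries to rebuild a COND column that the file does not contain; the toy data are invented.

```r
# Toy data frame with only a PRICE column (values invented for illustration).
trades <- data.frame(PRICE = c(100.5, 100.6, 100.7))

# trades$COND does not exist, so it is NULL, and as.numeric(NULL) is numeric(0).
missing_col <- as.numeric(trades$COND)

# Assigning a zero-length vector to a column of a 3-row data frame fails;
# tryCatch captures the error message instead of stopping the script.
msg <- tryCatch({
  trades$COND <- missing_col
  "no error"
}, error = function(e) conditionMessage(e))

print(msg)
```

In an English locale the captured message has the same form as the one above, with 3 in place of 1048575.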

All suggestions are welcome.


Solution

  • Thanks to Joshua Ulrich, I was able to gain some additional intuition and solve the problem(s). There is actually no need to manipulate the .csv file itself and add extra columns. Instead of setting tradecolnames = NULL, you tell the function which columns your file contains by setting tradecolnames = c("DATE","TIME","PRICE"). The problem with the non-existent directories is fixed by setting dir=TRUE. The final code looks like this:

    from = "2014-03-01"
    to = "2014-04-31"
    datasource = "C:/Users/aris/Desktop/raw_data"
    datadestination = "C:/Users/aris/Desktop/xts_data"
    convert(from, to, datasource, datadestination,
            trades = TRUE, quotes = FALSE, ticker = "REIT", dir = TRUE,
            extension = "csv", header = TRUE,
            tradecolnames = c("DATE", "TIME", "PRICE"),
            format = "%Y%m%d %H:%M:%S", onefile = TRUE)
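
Before running the conversion, it may help to confirm that the file's header really matches the names passed to tradecolnames. The following is a base-R sketch that does not call highfrequency at all; the helper name check_header and the sample file are hypothetical and only illustrate the idea.

```r
# Hypothetical helper: compare a CSV header with the expected column names.
check_header <- function(path, expected) {
  # Read only the first line and split it on commas; no data rows are loaded.
  header <- strsplit(readLines(path, n = 1), ",", fixed = TRUE)[[1]]
  # Drop optional double quotes and surrounding whitespace from each name.
  header <- trimws(gsub('"', "", header, fixed = TRUE))
  list(ok = identical(header, expected),
       missing = setdiff(expected, header),
       unexpected = setdiff(header, expected))
}

# Write a small sample file (invented values) to a temporary location.
sample_path <- tempfile(fileext = ".csv")
writeLines(c("DATE,TIME,PRICE",
             "20140102,09:30:00,100.5"), sample_path)

result <- check_header(sample_path, c("DATE", "TIME", "PRICE"))
print(result$ok)

# Asking for a column the file lacks reports it under $missing.
with_cond <- check_header(sample_path, c("DATE", "TIME", "PRICE", "COND"))
print(with_cond$missing)
```

Pointing check_header at the real file in datasource would show whether any expected column is absent before the conversion is attempted.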