
to.hourly adding open and close columns


I have a data.frame object containing OHLC data:

head(data,3)
Timestamp           Open   High   Low    Close   Vol
2016-02-05 13:45:00 1161.9 1162.4 1161.7 1161.8  592
2016-02-05 13:50:00 1161.8 1163.2 1161.7 1162.5  643
2016-02-05 13:55:00 1162.5 1164.7 1162.1 1164.5 1072

I then create another data.frame by extracting the High and Low columns:

x <- data[,c("High","Low")]

Which gives:

head(x,3)
Timestamp           High   Low
2016-02-05 13:45:00 1162.4 1161.7
2016-02-05 13:50:00 1163.2 1161.7
2016-02-05 13:55:00 1164.7 1162.1

And then convert to hourly:

x <- xts::to.hourly(x, indexAt='startof')  

Which somehow adds back on the "Open" and "Close" columns even though they did not exist in "x":

head(x,3)
Timestamp           x.Open x.High x.Low   x.Close
2016-02-05 13:45:00 1162.4 1164.7 1162.4  1164.7
2016-02-05 14:00:00 1167.2 1176.7 1167.1  1176.7
2016-02-05 15:00:00 1176.3 1176.3 1174.9  1176.2

The values in the Open and Close columns look as if they came from data rather than x, but how could the function get them when I never passed data to it?

Obviously there is an easy workaround, which is to remove the Open and Close columns (again) after calling to.hourly. But is this expected behavior, or am I missing something really simple?


Solution

  • The output is expected behaviour. You are reducing 5-minute bars to hourly bars, and to.hourly builds a full OHLC series at the lower frequency from whatever input it is given, not just an hourly HL series.

    to.hourly is a wrapper around to.period in the xts package. As per the documentation for to.period:

    Convert an OHLC or univariate object to a specified periodicity lower than the given data object. For example, convert a daily series to a monthly series, or a monthly series to a yearly one, or a one minute series to an hourly series.

    The result will contain the open and close for the given period, as well as the maximum and minimum over the new period, reflected in the new high and low, respectively.

    You haven't shown the 5-minute bars beyond 13:55:00, so the 1167.2 Open on the 14:00:00 hourly bar can't be checked. The first hourly bar can be checked, though: its Open (1162.4) and Close (1164.7) equal the first and last values of the High column in that hour, and its Low (1162.4) is the smallest High, not a value from the Low column. That is consistent with to.hourly treating only the first column of x as a univariate series, rather than reading anything from data; you'd have to look at the source code of to.period to confirm this for your version of xts. Either way, a true Open cannot be recovered from HL data alone, because you can't tell whether the High or the Low came first. If you're working with intraday bar data, it helps to keep at least the HLC, not just the HL.
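If all you want is an hourly High/Low series, one option is to aggregate each column separately instead of calling to.hourly on the two-column object. The following is a minimal sketch, not a definitive recipe: the first three rows are the ones from the question, the two 14:00 rows are made-up values for illustration, and the object names hl, ep and hourly_hl are hypothetical. It assumes the data is already an xts object with a UTC index.

```r
library(xts)

# 5-minute High/Low bars. The first three rows come from the question;
# the last two rows are invented purely for illustration.
idx <- as.POSIXct(c("2016-02-05 13:45:00", "2016-02-05 13:50:00",
                    "2016-02-05 13:55:00", "2016-02-05 14:00:00",
                    "2016-02-05 14:05:00"), tz = "UTC")
hl <- xts(cbind(High = c(1162.4, 1163.2, 1164.7, 1165.0, 1166.1),
                Low  = c(1161.7, 1161.7, 1162.1, 1164.2, 1164.9)),
          order.by = idx)

# Row positions of the last observation in each hour (starts with 0).
ep <- endpoints(hl, on = "hours")

# Highest High and lowest Low within each hour, one column at a time.
hourly_hl <- merge(period.max(hl[, "High"], ep),
                   period.min(hl[, "Low"],  ep))
colnames(hourly_hl) <- c("High", "Low")

print(hourly_hl)
```

Note that the result is stamped with the last observation of each hour (13:55 and 14:05 here), not the start of the hour as with indexAt='startof', so the index would need adjusting if start-of-period timestamps matter.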