r, regression, finance

How to regress all rows with the same date on a value?


I'm new to R. I have the following sample dataset:

> head(abn)
       Dates  DTM   YTM
1 2010-09-28 1133 2.965
2 2010-09-28 1834 3.613
3 2010-09-29 1132 2.994
4 2010-09-29 1833 3.595
5 2010-09-30 1131 3.026
6 2010-09-30 1832 3.590

The observations are yields of several bonds over an observation period from 2010 to 2016. My dataset is composed of multiple bonds with maturities between 1 and 15 years (260 to 3900 business days, as shown in the dataset). DTM stands for days to maturity and YTM for yield to maturity.

My goal is to construct a synthetic bond with a maturity of 5 years for each day. To do that, I need to fit a regression and find the YTM value at a DTM of 1300, which is exactly 5 years.

In other words, I need the fitted value on the y-axis at x = 1300, and I need it for every date separately.

Someone helped me and gave me this code:

library(dplyr)
newval <- data.frame(DTM=1300) #predict.lm likes new values in a dataframe
abn5y <- abn %>% group_by(Dates) %>%
  summarise(Y5=predict(lm(YTM ~ DTM), newval))

This worked. I then loaded the next dataset:

> head(bmp)
       Dates   DTM   YTM
      <dttm> <dbl> <dbl>
1 2007-11-02  1498 4.782
2 2007-11-02  1892 4.883
3 2007-11-02  1300 4.934
4 2007-11-05  1497 4.768
5 2007-11-05  1891 4.880
6 2007-11-05  1299 4.924

I used the same code and, over several attempts, got the following errors:

> bmp5y <- bmp %>% group_by(Dates) %>%
+   summarise(Y5=predict(lm(YTM ~ DTM), newval))
Error in eval(predvars, data, env) : object 'YTM' not found

> bmp5y <- bmp %>% group_by(dates) %>%
+   summarise(Y5=predict(lm(ytm ~ dtm), newval))
Error in grouped_df_impl(data, unname(vars), drop) :
  Column dates is unknown

> bmp5y <- bmp %>% group_by(Dates) %>%
+   summarise(Y5=predict(lm(ytm ~ dtm), newval))
Error in summarise_impl(.data, dots) :
  Column Y5 must be length 1 (a summary value), not 6563
In addition: Warning message:
'newdata' had 1 row but variables found have 6563 rows

What seems to be the problem?


Solution

  • It is not clear from the question precisely which code and data are being used. To reconstruct the problem in a reproducible and verifiable manner, copy and paste the code below into a fresh R session; it runs without any error messages for me:

    Lines <- "
          Dates   DTM   YTM
    1 2007-11-02 1498 4.782 
    2 2007-11-02 1892 4.883 
    3 2007-11-02 1300 4.934 
    4 2007-11-05 1497 4.768 
    5 2007-11-05 1891 4.880 
    6 2007-11-05 1299 4.924"  
    bmp <- read.table(text = Lines)
    
    library(dplyr)
    newval <- data.frame(DTM=1300)
    bmp %>% group_by(Dates) %>% summarise(Y5=predict(lm(YTM ~ DTM), newval))
    

    giving:

    # A tibble: 2 x 2
           Dates       Y5
          <fctr>    <dbl>
    1 2007-11-02 4.876237
    2 2007-11-05 4.863499