Search code examples
rdataframematrixapplyrolling-computation

Error in rolling matrix computations on data frames using apply()


I want to compute w %*% Sigma %*% t(w) for each row in a particular dataframe, where w is a row vector in each row of my dataframe and Sigma is a covariance matrix.

Sigma changes for each line of the given dataframe. It is calculated based on the date given in column Reset_Date in my original dataframe. The covariance matrix works, it is calculated correctly in another part of the code. But it does not work inside the apply().

apply(df,1, function(x) as.matrix(x[,3:8])%*%
          cov(Dados[c((which(Dados$Date==x$Reset_Date)):((which(Dados$Date==x$Reset_Date))-63)),c(2:ncol(Dados))])%*%
          t(as.matrix(x[,3:8]))) 

There must be something wrong with my code, because even the simpler version of calculating the dot product fails. Somehow the result is a matrix, not one scalar for each line.

apply(df[,3:8],1, function(x) as.matrix(x)%*%t(as.matrix(x)))

Any ideas on how to fix either example? Here is some sample data (dataframes Dados and df):

Dados=structure(list(Date = structure(c(16905, 16906, 16909, 16910, 
                                        16911, 16913, 16916, 16917, 16918, 16919, 16920, 16923, 16924, 
                                        16925, 16926, 16927, 16930, 16931, 16932, 16933, 16934, 16937, 
                                        16938, 16939, 16940, 16941, 16944, 16945, 16946, 16948, 16951, 
                                        16952, 16953, 16954, 16955, 16958, 16959, 16960, 16961, 16962, 
                                        16965, 16966, 16967, 16968, 16969, 16972, 16973, 16974, 16975, 
                                        16976, 16979, 16980, 16981, 16982, 16983, 16986, 16987, 16988, 
                                        16989, 16990, 16993, 16994, 16995, 16996, 16997, 17000, 17001, 
                                        17002, 17003, 17004, 17007, 17008, 17009), class = "Date"), c(0.00960585109965884, 
                                                                                                      0.0812984880734113, 0.112781679522822, -0.0145184885929517, -0.011662089633413, 
                                                                                                      -0.0327527768959612, 0.0151436007187478, -0.0608315211005128, 
                                                                                                      0.040412688006497, -0.0879756056911929, 0.00317371425342383, 
                                                                                                      -0.0734061020879029, 0.0221495449282072, -0.031900508719751, 
                                                                                                      -0.00909652588569898, -0.0132414400054515, -0.0636499025520987, 
                                                                                                      0.0595155178929385, 0.0496966022372147, 0.0278539993905147, -0.0232250551043034, 
                                                                                                      -0.0315789056561488, -0.0571768253829652, -0.0511197886498449, 
                                                                                                      0.0489696431147379, 0.0178738652271138, -0.0506959500704562, 
                                                                                                      -0.00254709665075126, -0.0299114881227025, -0.0215640151840946, 
                                                                                                      0.107550799579403, 0.00145426529312154, 0.0676915429902003, 0.0211475594729338, 
                                                                                                      0.0643398963306341, 0.00496393940383477, -0.0526404298981098, 
                                                                                                      0.0380905791373509, -0.0891580558943561, -0.0645347544569685, 
                                                                                                      0.00570724634156861, -0.0412632543372649, -0.0208006548438888, 
                                                                                                      -0.0470550159555883, -0.0123849706687551, -0.0023477039443609, 
                                                                                                      0.0819777903542596, 0.00618852230414024, 0.022977414335057, 0.0280724753787043, 
                                                                                                      0.0551947631538763, -0.195678423483481, -0.0880383210535407, 
                                                                                                      -0.0841519282731906, 0.112873303891403, 0.0175334088901247, -0.045942677930122, 
                                                                                                      -0.0300476169326158, 0.00351042344834962, 0.0389348704162185, 
                                                                                                      0.00829105248665307, -0.00693793603785409, 0.0280590459741381, 
                                                                                                      -0.00540102675314902, -0.0239906439583271, 0.0163368695597521, 
                                                                                                      -0.00674931659354616, 0.0439348031396891, -0.153964767847314, 
                                                                                                      -0.031463293190126, 0.0069188458542202, -0.0416134046884897, 
                                                                                                      0.00408029958172129), c(-0.292209663232257, 0.103599520438458, 
                                                                                                                              0.274708851088623, -0.0557109374167419, -0.157957979992429, -0.0554170235205764, 
                                                                                                                              0.113431253533336, -0.0600003781444314, 0.162598822729643, -0.311927817466595, 
                                                                                                                              0.599523149849479, -0.103795741757395, -0.0776634595741843, 0.0269244302550042, 
                                                                                                                              0.217581466695527, -0.0322834660227755, -0.191377271191473, 0.300726294548759, 
                                                                                                                              0.319934388717669, 0.295157465398543, 0.00890284707404376, -0.0641387079437505, 
                                                                                                                              -0.0459754968033854, -0.289985711542395, 0.0722398906096711, 
                                                                                                                              -0.123006711056173, -0.490962832929132, -0.175032264765085, -0.12032435526016, 
                                                                                                                              -0.176141231274118, 0.11500605744208, -0.0923868671712014, 0.190425336380873, 
                                                                                                                              0.277445949794952, 0.406070025160576, 0.0952039477377431, -0.0576003722424168, 
                                                                                                                              0.140203114680371, 0.0197629475892525, -0.0752226248850074, 0.0693752560910754, 
                                                                                                                              0.0455597855569589, -0.102661342015731, -0.179332559157575, -0.11668358170327, 
                                                                                                                              -0.00276900524260881, -0.0112290596820275, -0.0493628339251595, 
                                                                                                                              0.078736802678181, -0.0479679813838452, 0.270953545649233, -0.587305172541941, 
                                                                                                                              -0.621844499591828, -0.166447303595396, 0.469400973311429, -0.120315498643209, 
                                                                                                                              -0.0846596216766748, 0.163283239995615, 0.0301963989209453, 0.135297886837837, 
                                                                                                                              -0.0210178896502544, -0.00776600620899792, 0.0639517026860714, 
                                                                                                                              0.0994514622426612, 0.0804093657452043, -0.0396069299241519, 
                                                                                                                              -0.17970264410071, -0.0720211595746423, -0.309769824186645, -0.0426919498663469, 
                                                                                                                              -0.0413861665933934, -0.0900483966791432, 0.041316666402369), 
                     c(-0.172923456722018, 0.134604714780284, -0.1560279435179, 
                       -0.129325171317674, -0.447953460277239, -0.250739954530577, 
                       -0.133060846029998, -0.200451934763202, 0.292155683756645, 
                       0.0755916721141059, 0.0991565124715654, -0.321059221302833, 
                       0.317447102566271, 0.0541416607111156, 0.16780010277575, 
                       -0.180523755295636, 0.0725824189477198, -0.0474727311916778, 
                       0.0660469446123413, -0.258791216526055, 0.207473892380716, 
                       -0.318984991642168, -0.0847417109554605, -0.714407448278065, 
                       0.201086896294655, -0.0629606726738619, -0.0346745144537142, 
                       -0.127772777391799, -0.14244413702964, 0.165634200853604, 
                       -0.0524648703481123, -0.00501155817775878, -0.107636307277348, 
                       0.158348029790401, 0.554012994905162, -0.152487582847627, 
                       0.0209494458093484, 5.21267740971609e-05, 0.122168764166197, 
                       0.175188759978351, 0.099845530350251, -0.0168977078334631, 
                       0.0635841364313494, 0.122308954985573, -0.321999765736414, 
                       -0.302960538847608, -0.195812633272907, -0.00867287475447842, 
                       -0.377386838928739, 0.844449416990134, 0.665042348450107, 
                       -0.0560141731879904, -0.118387030913281, -0.127366513074922, 
                       0.155434880032068, -0.0524563488240037, 0.533499307196594, 
                       -0.176858024007176, -0.0116683258824191, 0.0569655478134923, 
                       -0.401625823151397, -0.48236119067917, 0.208467649310506, 
                       -0.403011638845796, -0.380580334051539, 0.00128658083677902, 
                       0.091238844006214, -0.195045070922639, 0.0375062080320232, 
                       -0.0658825153824982, -0.0900340515372577, -0.000997745642150605, 
                       0.244937614122231), c(-0.496406493957458, 1.82907214245793, 
                                             0.687389929405513, -0.868168973766292, -0.306820341474912, 
                                             1.24806375771418, -0.785290365465008, -0.748408280952884, 
                                             -0.0439414578156772, -1.15325279172931, -1.14831330200486, 
                                             1.34556712459546, 2.14439654026597, -1.07129621359316, 0.168114328681135, 
                                             -1.25447291028965, 0.764345101139363, -1.72380844604273, 
                                             -0.864886480236537, 0.705645454523873, 1.21877970414914, 
                                             -0.453807426490982, -0.103915626802553, 1.66893226342728, 
                                             0.456333904610107, -1.31941645733865, 1.16841346197136, 0.196616476842348, 
                                             0.455618551464032, 0.225278111059102, -0.789225340102151, 
                                             1.1669151547465, -0.857419069552467, -0.46375835280883, -1.64309626794125, 
                                             -0.976897895566697, -1.33753794277297, -2.20380877689387, 
                                             0.76534622622666, 0.896877396532703, 1.14413915494875, 0.430897580220613, 
                                             -1.11470964937711, 0.416358159120978, -1.44082437967862, 
                                             -0.771037731706115, -0.346660169353696, -0.315088673574393, 
                                             -1.02879378992917, 0.658626287068875, 0.837519010675614, 
                                             -2.83708881346894, -2.06085441775258, -1.11120482833581, 
                                             0.406117230674652, 1.09657628991393, 1.16335336420141, 0.812029759062316, 
                                             0.744500956980687, -1.90823722791477, 0.275859124192412, 
                                             -0.603942162238758, -0.107292555769767, -1.17736558795127, 
                                             0.63200279455371, -0.453611159666423, 0.371833612317296, 
                                             -0.334115289868742, 0.758067227135695, -0.110330422599259, 
                                             0.142565898070357, -0.463040430759687, 0.069680935638794), 
                     c(-1.44252854343511, 1.50584329170067, -0.679304963566529, 
                       1.49017730045089, -0.199777580504445, -1.40063647843419, 
                       -2.02982723227936, 2.30147598638455, 2.5760586765625, -0.356830865331259, 
                       -0.791625843271082, -0.699779830023839, -2.48209778254743, 
                       0.507451023252337, -1.73031060711909, 0.0380590482016663, 
                       -1.45960291459708, 4.02840531052773, -0.629898888003211, 
                       0.851315865438118, -2.75153031138621, -0.0551356722263807, 
                       -1.9123482098551, -0.598772542428672, -0.901269186047482, 
                       -0.869854203811926, -0.8415157385536, -0.0225120534220302, 
                       0.226544641604676, -0.924225208065921, -0.230135314202118, 
                       -1.05856046752094, 1.06353418823462, 1.73196386595527, 1.41535767476162, 
                       -0.423274291874653, 0.0587189624158047, 2.20834035330906, 
                       -1.04189900598274, -3.37082319034527, 0.43036818352391, -2.09127622720844, 
                       0.495243057530081, 0.963348655276053, 0.194895155374741, 
                       1.55353832464475, 0.957778107172214, -1.39300452998466, 2.74583010854526, 
                       -2.87356974415359, -1.76832522655593, 1.49292893448334, 1.93799313574115, 
                       0.976947606342438, 1.31788691450525, 0.590087144719553, -1.4342449850373, 
                       0.062377260576274, 0.164986209055984, 2.11246557193216, 1.48942844632503, 
                       0.496632188894508, 0.577669475760856, 1.56402903813502, 0.123037811391691, 
                       1.57762064227671, 0.326145986443338, -0.264116111911472, 
                       0.0596585364206703, 0.584152800203808, -0.279382462032451, 
                       -0.210669397167951, 0.0709602555787558), c(-0.0351709286018131, 
                                                                  -0.150881086035359, 0.601645988249588, 0.255978688101188, 
                                                                  0.0236933704295117, -0.567090501196776, -0.233687453902298, 
                                                                  0.134822177205463, 0.112484065576579, -0.97553415256052, 
                                                                  -0.558778428576845, 0.728533991939528, -0.920135343412154, 
                                                                  -0.646147884809356, -0.0763445803970142, 0.264988189688498, 
                                                                  0.0228996249262536, 1.19589898895847, -1.00861894342493, 
                                                                  -0.0693899810052745, -0.900303057566632, 0.927216320396496, 
                                                                  -0.993604425464922, -0.031929111972917, -0.423137092074644, 
                                                                  0.549488194589487, -0.26102501486347, 1.31568433032141, 0.645025908357066, 
                                                                  0.355081396389956, -0.0524648703481123, -0.152505708458051, 
                                                                  0.060563779407552, 0.230018059927928, -0.343648012783104, 
                                                                  0.437258121587658, 0.0765049489658054, 0.27848497837073, 
                                                                  -0.224249429047296, -0.969974752705105, -0.863990602065634, 
                                                                  -0.232333372512006, -0.236531757215996, 0.260839873862939, 
                                                                  -0.378251169565813, 0.528342245508862, 0.218763512644027, 
                                                                  -0.217625279775147, 1.28393978636681, -3.64443676454954, 
                                                                  -1.86212791696754, 1.72457029956796, 1.65081887178389, 1.30403541749646, 
                                                                  0.14238381012821, -0.0524563488240037, -0.737204478591569, 
                                                                  0.482828504338562, -0.139597650373546, 1.47287231225834, 
                                                                  0.288402824021206, 0.648453062535115, -0.0389746600371721, 
                                                                  0.473448744260674, -0.145358403641205, 0.185774251899384, 
                                                                  -0.195979063698148, 0.374579453017043, -0.413718154457121, 
                                                                  0.402926153900052, -0.353606189676337, -0.0201758132887475, 
                                                                  -0.172336080005542)), class = "data.frame", row.names = c(NA, 
                                                                                                                            73L))


df=structure(list(Reset_Date = structure(c(16996, 16997, 17000, 
                                           17001, 17002, 17003, 17004, 17007, 17008, 17009), class = "Date"), 
                  structure(c(16997, 17000, 17001, 17002, 17003, 17004, 17007, 
                              17008, 17009, 17010), class = "Date"), c(1.82637665045793, 
                                                                       0.838634901792124, -0.322817945118891, 0.884944453234911, 
                                                                       0.410642820219065, 0.816526995015299, 1.94844363045823, 0.833011009391704, 
                                                                       1.68127046626407, 1.13013668001142), c(-0.183091416363673, 
                                -0.319511762676758, -0.0316593389910776, 0.259788486653649, 
                                 0.492005994872685, 0.201027165479703, 0.0939241784296569, 
                                  0.231492377659909, 0.340750751275363, 0.0524028691361074), 
                  c(0.231101296885914, 0.231101296885914, 0.000521792803768906, 
                    0.000521792803768906, 0.0757021972820923, -0.0207214984966843, 
                    -0.256874719990114, -0.0624916417149904, -0.299739460405469, 
                    -0.0599027654003357), c(-0.106863689224328, -0.106863689224328, 
                                            -0.146576498534379, -0.146576498534379, -0.146576498534379, 
                                            -0.146576498534379, -0.0876257469847474, -0.0546632235816992, 
                                            0.00454466736633395, -0.0546303193896886), c(0.110991660181017, 
                                                                                         0.152418298691739, 0.152418298691739, 0.136643097053495, 
                                                                                         0.162449054526432, 0.197965817610092, 0.155889024964584, 
                                                                                         0.113155738867012, 0.113155738867012, 0.0894339689144131), 
                  c(0.0731007117478923, 0.0508244737588497, -0.0236654986334738, 
                    -0.0984261218820228, -0.0984261218820228, -0.173090546127112, 
                    -0.247883779469496, -0.173170115673856, -0.0985569058107293, 
                    -0.0239349811169282)), row.names = c(NA, 10L), class = "data.frame")

Solution

  • First, let me help other readers better understand your problem (which you really failed to clarify).

    • Dados is a data frame of 6 variables observed in consecutive days;
    • df is a data frame of weights (w) for these variables on consecutive days.

    (Therefore, if df has N rows, Dados should have N + 63 rows.)

    For each date in df, compute:

    • a window covariance matrix Sigma using observations in a 64-day period to date;
    • a quadratic form using Sigma and w.

    The following is my attempt to this problem.

    ## observations in consecutive days
    date1 <- Dados$Date
    obs <- as.matrix(Dados[, -1])
    
    ## weights in consecutive days
    date2 <- df$Reset_Date
    wts <- as.matrix(df[, 3:8])
    
    ## compute the desired quadratic form for the i-th day in `date2`
    QuadForm <- function (i) {
      ## compute a window `Sigma`
      Sigma <- cov(obs[i:(i + 63), ])
      ## extract `w`
      w <- wts[i, ]
      ## compute the desired quadratic form
      c(crossprod(w, Sigma %*% w)) 
    }
    
    ## loop through days in `date2`
    sapply(1:length(date2), QuadForm)
    # [1] 0.08429908 0.10472934 0.10985758 0.09600481 0.12414915 0.13099761
    # [7] 0.08225033 0.04184155 0.05513798 0.03362151
    

    I was reluctant to fix your code, because doing row-wise operation on a data frame using apply is a bad practice and is not worth a fix. But if you insist on knowing how to fix it, read on.

    apply is designed for matrices or arrays. If you feed it a data frame, it converts the data frame to a matrix at run time. This implicit conversion can cause you trouble, when your data frame have different data types across its columns.

    ## columns 3 to 8 are all numeric
    ## you get a numeric matrix after conversion
    as.matrix(df[, 3:8])
    #1 1.8263767 -0.1830914 0.2311013 -0.1068637 0.1109917 0.07310071
    #2 0.8386349 -0.3195118 0.2311013 -0.1068637 0.1524183 0.05082447 
    
    ## you can sum numeric values
    apply(df[, 3:8], 1, sum)
    # 1.9516152  0.8466035 -0.3717792  1.0368952  0.8957974  ...
    
    ## dang!! columns 1 to 2 are strings; columns 3 to 8 are numeric
    ## you get a character matrix after conversion
    as.matrix(df)                                     
    #1 "2016-07-14" "2016-07-15" " 1.8263767" "-0.18309142" " 0.2311012969" ...
    #2 "2016-07-15" "2016-07-18" " 0.8386349" "-0.31951176" " 0.2311012969" ...
    
    ## dang!! you can't sum strings
    apply(df, 1, sum)
    #Error in FUN(newX[, i], ...) : invalid 'type' (character) of argument
    

    Whether you do row-wise (MARGIN = 1) or column-wise (MARGIN = 2) operations, the function fed to apply() must work with vectors only. Why? Because extracting a row or a column from a matrix gives you a vector without dimension.

    ## dang!! can not index an atomic vector by $
    apply(df, 1, function (x) substring(x$Reset_Date, 1, 4))
    #Error in x$Reset_Date : $ operator is invalid for atomic vectors
    
    ## OK
    apply(df, 1, function (x) substring(x[1], 1, 4)) 
    #"2016" "2016" "2016" "2016" "2016" ...
    
    ## dang!! can not index an atomic vector by [, 1:2] 
    apply(df[, 3:8], 1, function (x) sum(x[, 1:2]))
    #Error in x[, 1:2] : incorrect number of dimensions
    
    ## OK
    apply(df[, 3:8], 1, function (x) sum(x[1:2]))
    # 1.6432852  0.5191231 -0.3544773  1.1447329  0.9026488  ...
    

    In addition, a vector without dimension is treated as a column vector (in the math sense). So even if you think x is a row-vector, the dot product is in fact t(x) %*% x instead of x %*% t(x).

    ## dang!! not computing dot product
    apply(df[, 3:8], 1, function (x) x %*% t(x))
    
    ## this computes dot product
    apply(df[, 3:8], 1, function (x) t(x) %*% x)
    # 3.4516647 0.8960384 0.1504901 0.9004607 0.4739903 ...
    

    Fully understanding the above issues, your code should be fixed as

    ## as.matrix(df) is a string matrix
    ## so the input 'x' will be a string vector
    fun <- function (x) {
      ## 1. x[3:8], not x[, 3:8]
      ## 2. must as.numeric() or you have a string vector
      w <- as.numeric(x[3:8])
      pos <- which(Dados$Date == x[1])
      span <- pos:(pos - 63)
      Sigma <- cov(Dados[span, -1])
      ## not w %*% Sigma %*% t(w)
      t(w) %*% Sigma %*% w
    }
    
    apply(df, 1, fun)
    #0.08429908 0.10472934 0.10985758 0.09600481 0.12414914 0.13099762 0.08225033 
    #0.04184155 0.05513799 0.03362152 
    

    The result is identical to my proposed solution.

    To demonstrate how inefficient your (fixed) code is, let's do a benchmark.

    library(microbenchmark)
    microbenchmark("me" = sapply(1:length(date2), QuadForm),
                   "you" = apply(df, 1, fun))
    #Unit: microseconds
    # expr      min        lq      mean    median       uq      max neval
    #   me  521.704  542.1485  577.5259  574.4675  597.773  755.889   100
    #  you 6121.230 6151.9695 6412.4002 6197.5450 6237.770 9448.530   100