Search code examples
rdataframemultiplicationcalculated-columns

Conditional multiplying based on column name


I already checked this solution but it doesn't fit my data structure.

Input

I have a data frame Portfolio containing company indentifiers Indentifier, portfolio weights Weight, Dates and in the rest of the columns return data at least for each of the Indentifiers:

# Portfolio

            Indentifier      Weight       Date AIR.PA SCHN.PA ASML.AS  TCH.PA ERICb.ST SASY.PA  RE.PA TELIA.ST NOKIA.HE DAIGn.DE ALSO.PA ...

1               AIR.PA  0.007671609 2003-04-30 0.2164 -0.0112  0.3008  0.2772   0.4247  0.1592 0.2780   0.2138   0.1975   0.1351  0.2908
2               SCHN.PA 0.004302912 2003-05-30 0.1420 -0.0482  0.0904  0.1135   0.0729  0.0168 0.0210   0.0640   0.0092  -0.0744  0.8791
3               SCHN.PA 0.004302912 2003-06-30 0.1317  0.0533 -0.0201 -0.0588   0.0683 -0.0402 0.2177   0.0782  -0.0627   0.1364 -0.1228
4               AIR.PA  0.007671609 2003-06-30 0.3311  0.1661  0.4075  0.0424   0.3646 -0.0176 0.0738   0.0155  -0.0502   0.0556  0.0333
5               AIR.PA  0.007671609 2003-07-31 0.0188  0.0302  0.2328  0.1836   0.0923  0.0220 0.1104  -0.0257   0.0954   0.0810 -0.2774
...

I want to multiply the Indentifiers' Weight for a Date with the return stored in the column with the same column name as the Indentifier.

Example

For row 1 with the Indentifier AIR.PA, with the Date 2003-04-30, the calculation would be: Weight 0.007671609 * return data 0.2164 = 0.001660.

Ideally the output in Portfolio for AIR.PA, SCHN.PA, and ASML.AS would be the following:

# Portfolio

        Date  AIR.PA           SCHN.PA     ASML.AS  ...
        
1 2003-04-30  0.001660              NA          NA
2 2003-05-30        NA       -0.000207          NA
3 2003-06-30  0.002540        0.000229          NA
4 2003-07-31  0.000144              NA          NA   
...

Thanks in advance for the help! Best regards, Thomas.


Solution

  • Should be straight forward like so:

    
    dat %>%
        pivot_longer( cols=-c(Identifier,Weight,Date) ) %>%
        mutate( score = Weight * value ) %>%
        filter( Identifier == name ) %>%
        pivot_wider( names_from=name, values_from=score ) %>%
        select( -c(Identifier,Weight,value) ) 
        
    

    Applied to your data:

    
    dat <- read.table(text=
    "            Identifier      Weight       Date AIR.PA SCHN.PA ASML.AS  TCH.PA ERICb.ST SASY.PA  RE.PA TELIA.ST NOKIA.HE DAIGn.DE ALSO.PA
    1               AIR.PA  0.007671609 2003-04-30 0.2164 -0.0112  0.3008  0.2772   0.4247  0.1592 0.2780   0.2138   0.1975   0.1351  0.2908
    2               SCHN.PA 0.004302912 2003-05-30 0.1420 -0.0482  0.0904  0.1135   0.0729  0.0168 0.0210   0.0640   0.0092  -0.0744  0.8791
    3               SCHN.PA 0.004302912 2003-06-30 0.1317  0.0533 -0.0201 -0.0588   0.0683 -0.0402 0.2177   0.0782  -0.0627   0.1364 -0.1228
    4               AIR.PA  0.007671609 2003-06-30 0.3311  0.1661  0.4075  0.0424   0.3646 -0.0176 0.0738   0.0155  -0.0502   0.0556  0.0333
    5               AIR.PA  0.007671609 2003-07-31 0.0188  0.0302  0.2328  0.1836   0.0923  0.0220 0.1104  -0.0257   0.0954   0.0810 -0.2774
    ", header=TRUE ) %>% mutate( Date = as.Date(Date) )
    
    dat %>%
        pivot_longer( cols=-c(Identifier,Weight,Date) ) %>%
        mutate( score = Weight * value ) %>%
        filter( Identifier == name ) %>%
        pivot_wider( names_from=name, values_from=score ) %>%
        select( -c(Identifier,Weight,value) )
    
    

    Produces:

    
    # A tibble: 5 x 3
      Date          AIR.PA   SCHN.PA
    * <date>         <dbl>     <dbl>
    1 2003-04-30  0.00166  NA       
    2 2003-05-30 NA        -0.000207
    3 2003-06-30 NA         0.000229
    4 2003-06-30  0.00254  NA       
    5 2003-07-31  0.000144 NA