Tags: r, datatable, data-manipulation, calculated-columns

How to change columns in a data.table based on a specific row of data


I have bilateral exchange data, such as:

library(data.table)

mwe <- data.table(Importer=c("Country_A","Country_A","Country_A","Country_A",
                             "Country_B","Country_B","Country_B","Country_B",
                             "World","World","World","World",
                             "Country_C","Country_C","Country_C","Country_C"),
                  Short_Importer=c("A","A","A","A",
                                   "B","B","B","B",
                                   "W","W","W","W",
                                   "C","C","C","C"),
                  Exporter=c("Country_A", "Country_B", "World", "Country_C",
                             "Country_A", "Country_B", "World", "Country_C",
                             "Country_A", "Country_B", "World", "Country_C",
                             "Country_A", "Country_B", "World", "Country_C"),
                  Value=c(0,12,36,24,
                          10,0,44,34,
                          30,22,110,58,
                          20,10,30,0))

I have reshaped it to a wide format:

mwe_wide <- dcast(mwe, Importer + Short_Importer ~ Exporter, value.var = "Value")

I would like this data.table, but with shares in the columns instead of values. I would therefore like to replace columns 3 to 6 with the same values divided by the amount in the same column for the World row. I assume it is not very complicated, but I have not found a satisfying way to do it. In reality I have several subregions, so I would like to avoid deleting the World row and then dividing by the sum.

desired_output <- data.table(Importer=c("Country_A", "Country_B", "Country_C", "World"),
                             Short_Importer=c("A","B","C","W"),
                             Country_A =c(0,0.33,0.67,1),
                             Country_B =c(0.55,0,0.45,1),
                             Country_C =c(0.41,0.59,0,1),
                             World =c(0.33,0.40,0.27,1))

Solution

  • If we need to divide each column by its value in the row where 'Importer' is 'World':

    mwe_wide[, (3:6) := lapply(.SD, function(x) 
               round(x/x[Importer == "World"],  2)), .SDcols = 3:6] 
    mwe_wide
    #    Importer Short_Importer Country_A Country_B Country_C World
    #1: Country_A              A      0.00      0.55      0.41  0.33
    #2: Country_B              B      0.33      0.00      0.59  0.40
    #3: Country_C              C      0.67      0.45      0.00  0.27
    #4:     World              W      1.00      1.00      1.00  1.00
    

    Or with Map (this version does not round the result):

    mwe_wide[, (3:6) := Map(`/`, .SD, .SD[Importer == 'World']), .SDcols = 3:6]
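
The same idea can be sketched with columns selected by name rather than by position, which may be easier to maintain if id columns are added or reordered. This is only a minimal sketch built on the question's example data; the names `id_cols`, `value_cols`, and `shares` are introduced here for illustration and are not part of the original answer. It works on a copy, so `mwe_wide` keeps its raw values.

```r
library(data.table)

# Rebuild the wide table from the question so the sketch is self-contained.
mwe <- data.table(
  Importer       = rep(c("Country_A", "Country_B", "World", "Country_C"), each = 4),
  Short_Importer = rep(c("A", "B", "W", "C"), each = 4),
  Exporter       = rep(c("Country_A", "Country_B", "World", "Country_C"), times = 4),
  Value          = c(0, 12, 36, 24,
                     10, 0, 44, 34,
                     30, 22, 110, 58,
                     20, 10, 30, 0)
)
mwe_wide <- dcast(mwe, Importer + Short_Importer ~ Exporter, value.var = "Value")

# Hypothetical helper names: every column that is not an id column is a value column.
id_cols    <- c("Importer", "Short_Importer")
value_cols <- setdiff(names(mwe_wide), id_cols)

# Divide each value column by its entry in the "World" row.
# copy() means the := update is applied to a new table, not to mwe_wide.
shares <- copy(mwe_wide)[
  , (value_cols) := lapply(.SD, function(x) x / x[Importer == "World"]),
  .SDcols = value_cols
]
```

Wrapping the division in `round(..., 2)`, as in the first solution, gives the two-decimal display. If a value in the World row were zero, the division would produce `NaN` or `Inf`, so such columns would need separate handling.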