Update: I should have been clearer that I was trying to check out the Enhanced functionality in reshaping using data.tables https://cran.r-project.org/web/packages/data.table/vignettes/datatable-reshape.html. Updated the title.
I have this data set with two sets of variables - Credit_Risk_Capital and Name_concentration. They are calculated per 2 methodologies - New and Old. When I melt them using the data.table package, the variable names default to 1 and 2. How can I change them to just Credit_Risk_Capital and Name_Concentration.
Here is the data set
df <-data.table (id = c(1:100),Credit_risk_Capital_old= rnorm(100, mean = 400, sd = 60),
NameConcentration_old= rnorm(100, mean = 100, sd = 10),
Credit_risk_Capital_New =rnorm(100, mean = 200, sd = 10),
NameConcentration_New = rnorm(100, mean = 40, sd = 10))
old <- c('Credit_risk_Capital_old','NameConcentration_old')
new<-c('Credit_risk_Capital_New','NameConcentration_New')
t1<-melt(df, measure.vars = list(old,new), variable.name = "CapitalChargeType",value.name = c("old","new"))
Now instead of the elements in the CapitalChargeType Column getting tagged as 1's and 2's, I want them to be changed to Credit_risk_Capital and NameConcentration. I can obviously change them in a subsequent step using a 'match' function, but is there anyway I can do it within melt itself.
I'm not sure about using melt
, but here's a way using tidyr
Note that I changed the variable name to use a .
instead of _
to separate the name for the old
/new
. This makes it easier to separate the name into two variables since there are already many underscores.
library(tidyr)
df <- dplyr::data_frame(
id = c(1:100),
Credit_risk_Capital.old= rnorm(100, mean = 400, sd = 60),
NameConcentration.old= rnorm(100, mean = 100, sd = 10),
Credit_risk_Capital.new =rnorm(100, mean = 200, sd = 10),
NameConcentration.new = rnorm(100, mean = 40, sd = 10)
)
df %>%
gather("key", "value", -id) %>%
separate(key, c("CapitalChargeType", "new_old"), sep = "\\.") %>%
spread(new_old, value)
#> # A tibble: 200 x 4
#> id CapitalChargeType new old
#> * <int> <chr> <dbl> <dbl>
#> 1 1 Credit_risk_Capital 182.10955 405.78530
#> 2 1 NameConcentration 42.21037 99.44172
#> 3 2 Credit_risk_Capital 184.28810 370.14308
#> 4 2 NameConcentration 60.92340 120.13933
#> 5 3 Credit_risk_Capital 191.07982 389.50818
#> 6 3 NameConcentration 25.81776 90.91502
#> 7 4 Credit_risk_Capital 193.64247 327.56853
#> 8 4 NameConcentration 32.71050 94.95743
#> 9 5 Credit_risk_Capital 208.63547 286.59351
#> 10 5 NameConcentration 40.76064 116.52747
#> # ... with 190 more rows