Given a dataframe as follows, how can I rescale v5 so that the mean is 100 and the standard deviation is 15?
head(df, n=5)
Out:
v1 v2 v3 v4 v5
65 1 121.12 4 27
98 1 89.36 4 25
85 1 115.44 4 27
83 1 99.45 3 25
115 1 92.75 4 27
98 0 107.90 1 18
I have tried with the psych package, but the final df is not correct for the last column:
library(psych)
library(tidyverse)
v5.rescaled <- df %>% rescale(df$v5, mean = 100, sd = 15)
df$v5.rescaled
Out:
t.t.scale.x.....sd...mean.
121.11985
89.35994
115.43986
99.44991
92.74993
But the output of head(df, n=5) is not correct for the rescaled v5:
v1 v2 v3 v4 v5 v5.rescaled
1 65 1 121.12 4 27 <data.frame [5 × 1]>
2 98 1 89.36 4 25 <data.frame [5 × 1]>
3 85 1 115.44 4 27 <data.frame [5 × 1]>
4 83 1 99.45 3 25 <data.frame [5 × 1]>
5 115 1 92.75 4 27 <data.frame [5 × 1]>
It is not clear how the data frame with columns v1 - v5 relates to the subsequent code chunk referring to df$mother.iq. The documentation for psych::rescale() specifically states that the input, x, should be a matrix or data frame. I suspect this is why the output you get is not what you were expecting. Although you could use psych::rescale(), a better alternative that offers more flexibility may be to forego the additional dependency on the {psych} package altogether and, instead, simply manually rescale the columns as required. The two approaches are illustrated in the reprex below:
# load libraries
library(tidyverse)
# define data as per OP
df <- data.frame(
  v1 = c(65L, 98L, 85L, 83L, 115L, 98L),
  v2 = c(1L, 1L, 1L, 1L, 1L, 0L),
  v3 = c(121.12, 89.36, 115.44, 99.45, 92.75, 107.9),
  v4 = c(4L, 4L, 4L, 3L, 4L, 1L),
  v5 = c(27L, 25L, 27L, 25L, 27L, 18L)
)
# rescale via psych::rescale using entire data frame
df %>% psych::rescale(mean = 100, sd = 15)
#> v1 v2 v3 v4 v5
#> 1 77.38682 106.12372 119.90143 108.25723 109.31746
#> 2 106.46091 106.12372 82.24089 108.25723 100.71673
#> 3 95.00748 106.12372 113.16617 108.25723 109.31746
#> 4 93.24541 106.12372 94.20546 95.87139 100.71673
#> 5 121.43847 106.12372 86.26070 108.25723 109.31746
#> 6 106.46091 69.38138 104.22535 71.09970 70.61416
# if you only want to do this for specific columns, do it manually by targeting
# columns using dplyr::mutate_at(), an anonymous function, and scale (from base
# R):
df %>%
  mutate_at(vars(v4, v5), function(x) scale(x)*15 + 100)
#> v1 v2 v3 v4 v5
#> 1 65 1 121.12 108.25723 109.31746
#> 2 98 1 89.36 108.25723 100.71673
#> 3 85 1 115.44 108.25723 109.31746
#> 4 83 1 99.45 95.87139 100.71673
#> 5 115 1 92.75 108.25723 109.31746
#> 6 98 0 107.90 71.09970 70.61416
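
If the goal is to keep the original v5 and add the rescaled values as a new column (as the question's expected output suggests), a minimal base R sketch could look like the following. The helper name rescale_to and its arguments new_mean and new_sd are hypothetical, introduced here only for illustration; they are not part of {psych} or {dplyr}. The helper standardises the vector by hand, then shifts and stretches it to the target mean and standard deviation.

```r
# data as defined in the reprex above
df <- data.frame(
  v1 = c(65L, 98L, 85L, 83L, 115L, 98L),
  v2 = c(1L, 1L, 1L, 1L, 1L, 0L),
  v3 = c(121.12, 89.36, 115.44, 99.45, 92.75, 107.9),
  v4 = c(4L, 4L, 4L, 3L, 4L, 1L),
  v5 = c(27L, 25L, 27L, 25L, 27L, 18L)
)

# hypothetical helper (not from any package): z-score the vector, then
# rescale it to the requested mean and standard deviation
rescale_to <- function(x, new_mean = 100, new_sd = 15) {
  (x - mean(x)) / sd(x) * new_sd + new_mean
}

# add the rescaled values as a new column, leaving v5 untouched
df$v5.rescaled <- rescale_to(df$v5)

# sanity check: should be 100 and 15, up to floating-point error
mean(df$v5.rescaled)
sd(df$v5.rescaled)
```

Because the helper uses plain arithmetic rather than scale(), the new column is an ordinary numeric vector; scale() returns a one-column matrix, which is worth wrapping in as.numeric() when a plain vector is wanted. In dplyr 1.0.0 and later, mutate_at() is superseded, and the same per-column rescaling can be written with mutate(across(c(v4, v5), rescale_to)).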