I have a dataset below:
A B C D
500 2 4 6
501 6 8 45
502 4 7 9
How do I normalize every column except the first so that its values fall within a set range around that column's mean?
So for example below are the means for each column:
B = 4
C = 6.333
D = 20
I then want to normalize each column so that its bounds are no more than 25% of the mean in either direction.
I think you can do it with rescale but I just don't know how to apply it to all columns:
library(scales)
rescale(x, to = c(mean(x) - 0.25*mean(x), mean(x) + 0.25*mean(x)))
I know the following is a way to do min-max normalization, but it doesn't take into account the bounds of 25% around the mean:
library(dplyr)
normalized <- function(x){
  return((x-min(x)) / (max(x)-min(x)))
}
normalized_dataset <- df %>%
  mutate_at(vars(-one_of("A")), normalized)
I assume the function rescale comes from package scales.
This is a typical example of the use of the *apply family of functions.
I will work on a copy of the data and rescale the copy. If you don't want to keep the original, it's a simple matter to modify the code below.
dat2 <- dat
dat2[-1] <- lapply(dat2[-1], function(x)
scales::rescale(x, to = c(mean(x) - 0.25*mean(x), mean(x) + 0.25*mean(x))))
dat2
# A B C D
#1 500 3 4.750000 15.00000
#2 501 5 7.916667 25.00000
#3 502 4 7.125000 15.76923
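To make it clearer what rescale is doing here, below is a rough base R sketch of the same linear mapping without the scales package. The helper name rescale_around_mean and its frac argument are made up for illustration, and the sketch assumes each column is numeric, has at least two distinct values, and has a positive mean.

```r
# Hypothetical helper (not part of scales): linearly map x onto
# [m - frac*m, m + frac*m], where m = mean(x).
# Assumes x is numeric, has at least two distinct values, and a positive mean.
rescale_around_mean <- function(x, frac = 0.25) {
  m <- mean(x)
  lower <- m - frac * m
  upper <- m + frac * m
  # Min-max scale to [0, 1], then stretch and shift onto [lower, upper].
  lower + (x - min(x)) / (max(x) - min(x)) * (upper - lower)
}

# Same data as in the question, built inline so the sketch is self-contained.
dat <- data.frame(A = 500:502, B = c(2, 6, 4), C = c(4, 8, 7), D = c(6, 45, 9))

dat3 <- dat
dat3[-1] <- lapply(dat3[-1], rescale_around_mean)
dat3
```

The smallest value in each column lands on the lower bound and the largest on the upper bound, so the mean of the rescaled column will generally differ from the original mean.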
Data.
dat <- read.table(text = "
A B C D
500 2 4 6
501 6 8 45
502 4 7 9
", header = TRUE)
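Since the question uses a dplyr pipeline, here is a sketch of the same rescaling written with mutate and across. It assumes dplyr 1.0.0 or later (where across is available) and that scales is installed; the data frame is rebuilt inline only to keep the example self-contained.

```r
library(dplyr)

# Same data as in the question.
dat <- data.frame(A = 500:502, B = c(2, 6, 4), C = c(4, 8, 7), D = c(6, 45, 9))

# Rescale every column except A onto [0.75 * mean, 1.25 * mean].
normalized_dataset <- dat %>%
  mutate(across(-A, ~ scales::rescale(.x, to = mean(.x) * c(0.75, 1.25))))

normalized_dataset
```

This should give the same values as the lapply version while leaving column A untouched.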