I am working with the dataset uscrime
but this question applied to any well-known dataset like cars
.
After to googling I found extremely useful to standardize my data, considering that PCA finds new directions based on covariance matrix of original variables, and covariance matrix is sensitive to standardization of variables.
Nevertheless, I found "It is not necessary to standardize the variables, if all the variables are in same scale."
To standardize the variable I am using the function:
z_uscrime <- (uscrime - mean(uscrime)) / sd(uscrime)
Prior to standardize my data, how to check if all the variables are in the same scale or not?
Proving my point that you can standardize your data however many times you want
library(tidyverse)
library(recipes)
#>
#> Attaching package: 'recipes'
#> The following object is masked from 'package:stringr':
#>
#> fixed
#> The following object is masked from 'package:stats':
#>
#> step
simple_recipe <- recipe(mpg ~ .,data = mtcars) %>%
step_center(everything()) %>%
step_scale(everything())
mtcars2 <- simple_recipe %>%
prep() %>%
juice()
simple_recipe2 <- recipe(mpg ~ .,data = mtcars2) %>%
step_center(everything()) %>%
step_scale(everything())
mtcars3 <- simple_recipe2 %>%
prep() %>%
juice()
all.equal(mtcars2,mtcars3)
#> [1] TRUE
mtcars2 %>%
summarise(across(everything(),.fns = list(mean = ~ mean(.x),sd = ~sd(.x)))) %>%
pivot_longer(everything(),names_pattern = "(.*)_(.*)",names_to = c("stat", ".value"))
#> # A tibble: 11 x 3
#> stat mean sd
#> <chr> <dbl> <dbl>
#> 1 cyl -1.47e-17 1
#> 2 disp -9.08e-17 1
#> 3 hp 1.04e-17 1
#> 4 drat -2.92e-16 1
#> 5 wt 4.68e-17 1.00
#> 6 qsec 5.30e-16 1
#> 7 vs 6.94e-18 1.00
#> 8 am 4.51e-17 1
#> 9 gear -3.47e-18 1.00
#> 10 carb 3.17e-17 1.00
#> 11 mpg 7.11e-17 1
mtcars3 %>%
summarise(across(everything(),.fns = list(mean = ~ mean(.x),sd = ~sd(.x)))) %>%
pivot_longer(everything(),names_pattern = "(.*)_(.*)",names_to = c("stat", ".value"))
#> # A tibble: 11 x 3
#> stat mean sd
#> <chr> <dbl> <dbl>
#> 1 cyl -1.17e-17 1
#> 2 disp -1.95e-17 1
#> 3 hp 9.54e-18 1
#> 4 drat 1.17e-17 1
#> 5 wt 3.26e-17 1
#> 6 qsec 1.37e-17 1
#> 7 vs 4.16e-17 1
#> 8 am 4.51e-17 1
#> 9 gear 0. 1
#> 10 carb 2.60e-18 1
#> 11 mpg 4.77e-18 1
Created on 2020-06-07 by the reprex package (v0.3.0)