My data is:
X0  X1  X2  X3  category
 0  15   4   4  TAH
 0   2   5   0  MAT
 0  11   9   0  BIO
I want to calculate normality, skewness and kurtosis row-wise, because each row belongs to a different category (stored in a dedicated column). Is there a function that can do this?
I have been trying to do this using the moments package and the dplyr package, similar to this post: "Function that calculates, mean, variance and skewness at the same time in a dataframe". But their solution is column-wise, not row-wise:
df3 %>%
  gather(category, Val) %>%
  group_by(category) %>%
  summarise(Mean = mean(Val),
            Vari = var(Val),
            Skew = skewness(Val))
For normality, I have tried the following command separately for each row:
shapiro.test(df3[1,])
Any help on this would be greatly appreciated.
You can use rowwise():
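library(dplyr)
library(tidyr)

# df is the data frame from the question (called df3 there)
df %>%
  rowwise() %>%
  # c_across() collects the selected columns of the current row into a vector
  mutate(Mean = mean(c_across(X0:X3)),
         Vari = var(c_across(X0:X3)),
         Shap = shapiro.test(c_across(X0:X3))$p.value,
         Skew = moments::skewness(c_across(X0:X3))) %>%
  ungroup()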
#      X0    X1    X2    X3 category  Mean   Vari  Shap     Skew
#   <int> <int> <int> <int> <chr>    <dbl>  <dbl> <dbl>    <dbl>
# 1     0    15     4     4 TAH       5.75 41.583 0.232 0.84778
# 2     0     2     5     0 MAT       1.75 5.5833 0.220 0.68925
# 3     0    11     9     0 BIO       5    34     0.110 0.058244
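The question also asks for kurtosis, which the code above leaves out. A minimal sketch of adding it the same way, assuming the moments package is installed and rebuilding the sample data as df so the snippet runs on its own. Note that moments::kurtosis() returns Pearson's kurtosis rather than excess kurtosis, so a normal distribution gives a value of about 3:

```r
library(dplyr)

# Sample data from the question, rebuilt here so the snippet is self-contained
df <- data.frame(
  X0 = c(0L, 0L, 0L),
  X1 = c(15L, 2L, 11L),
  X2 = c(4L, 5L, 9L),
  X3 = c(4L, 0L, 0L),
  category = c("TAH", "MAT", "BIO")
)

res <- df %>%
  rowwise() %>%
  # Pearson's kurtosis of the four values in the current row
  mutate(Kurt = moments::kurtosis(c_across(X0:X3))) %>%
  ungroup()

res
```

With only four values per row, skewness, kurtosis and the Shapiro-Wilk p-value are all very rough estimates, so treat them with caution.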
Similar to your attempt, you can also reshape the data to long format and calculate the statistics for each category. This matches the row-wise result as long as each category appears in only one row.
df %>%
  pivot_longer(cols = -category) %>%
  group_by(category) %>%
  summarise(Mean = mean(value),
            Vari = var(value),
            Skew = moments::skewness(value))
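If you would rather not depend on dplyr or moments, the same row-wise statistics can be sketched in base R with apply(). This is only an illustration: row_stats is a hypothetical helper name, and the skewness and kurtosis are computed by hand from population (divide-by-n) central moments, which is my assumption for matching the definitions used by the moments package:

```r
# Sample data from the question, rebuilt here so the snippet is self-contained
df <- data.frame(
  X0 = c(0L, 0L, 0L),
  X1 = c(15L, 2L, 11L),
  X2 = c(4L, 5L, 9L),
  X3 = c(4L, 0L, 0L),
  category = c("TAH", "MAT", "BIO")
)

# Hypothetical helper: statistics for one row, given as a numeric vector
row_stats <- function(x) {
  d  <- x - mean(x)   # deviations from the row mean
  m2 <- mean(d^2)     # second central moment (divides by n)
  c(Mean = mean(x),
    Vari = var(x),                   # sample variance (divides by n - 1)
    Shap = shapiro.test(x)$p.value,  # Shapiro-Wilk normality p-value
    Skew = mean(d^3) / m2^1.5,
    Kurt = mean(d^4) / m2^2)
}

# Apply the helper to every row of the numeric columns only
num <- as.matrix(df[, c("X0", "X1", "X2", "X3")])
out <- cbind(df, t(apply(num, 1, row_stats)))

out
```

Restricting the matrix to the numeric columns matters here: passing a whole row including category would coerce everything to character and the statistics would fail.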