I need a little help polishing my code. I am trying to run a Wilcoxon rank-sum test and am using the following code to do it:
Program_A <- c(95,78,88,84,89,83,79,85,74,81,77,82)
Program_B <- c(91,93,83,98,86,95,99,100,94,107,92,102,105,103,87)
n1 <- length(Program_A)
n2 <- length(Program_B)
# make data frame
Program_data <- data.frame(
  sections = c(rep("Program_A", n1),
               rep("Program_B", n2)),
  scores = c(Program_A, Program_B)
)
Program_data
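# rank all scores together, then sum the ranks within each program
library(dplyr)  # provides %>%, mutate(), group_by(), summarise()
Program_data1 <- Program_data %>%
  mutate(
    score_rank = rank(scores)
  ) %>%
  group_by(sections) %>%
  summarise(test_stat = sum(score_rank))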
Program_data1
# sections test_stat
# <chr> <dbl>
# 1 Program_A 94
# 2 Program_B 284
Tx <- 94 # using the smallest rank sum (Program_A, the shorter sample)
n1
n2
z <- (Tx - (n1*(n1+n2+1))/2)/sqrt((n1*n2*(n1+n2+1))/12)
z
This works as long as Program_A is the shorter vector.
However, what I'd like to do now is compare the lengths of Program_A and Program_B to find which is longer, in case the number of values changes.
Ex:
Program_A <- c(95,78,88,84)
Program_B <- c(91,93,83,98,86,95)
I would like a way to test which variable is longer, get each length, and assign them so that n1 always holds the length of the shorter variable and n2 always holds the length of the longer one.
Thanks, DM
We can also use lengths() together with min() and max():
l1 <- lengths(list(Program_A, Program_B))
n1 <- min(l1)
n2 <- max(l1)
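Note that swapping n1 and n2 is only half of it: Tx in the question is hard-coded as the rank sum of the shorter sample, so it probably needs to follow the swap too. Here is a minimal base R sketch of that idea. The helper name rank_sum_z is made up for illustration (it is not from the question or from any package), and it assumes the z formula from the question is the one you want.

```r
# Hypothetical helper (name is an assumption, not from the original post):
# returns n1, n2, Tx and z, always treating the shorter sample as group 1.
rank_sum_z <- function(x, y) {
  # Swap so that x is the shorter sample; equal lengths keep the given order
  if (length(x) > length(y)) {
    tmp <- x
    x <- y
    y <- tmp
  }
  n1 <- length(x)
  n2 <- length(y)
  # Rank the pooled scores; tied values get average ranks (rank() default)
  pooled_ranks <- rank(c(x, y))
  # The shorter sample occupies the first n1 positions of the pooled vector
  Tx <- sum(pooled_ranks[seq_len(n1)])
  # Same normal approximation as in the question
  z <- (Tx - n1 * (n1 + n2 + 1) / 2) / sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
  list(n1 = n1, n2 = n2, Tx = Tx, z = z)
}

Program_A <- c(95, 78, 88, 84)
Program_B <- c(91, 93, 83, 98, 86, 95)
res <- rank_sum_z(Program_B, Program_A)  # argument order does not matter
res$n1  # 4
res$n2  # 6
res$Tx  # 17.5
res$z
```

With the original 12- and 15-value vectors this gives Tx = 94, matching the summarise() output in the question, so the dplyr step is not strictly needed if all you want is the z value.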