# R Wilcoxon test on groups

I would like to perform a Wilcoxon test on a paired sample, and I am wondering whether my code is correct for what I want to test. I want to know whether my dependent variable, mean moisture (Feuchte), differs significantly between the levels of my independent variable, distance (Transtyp), within each kettle hole (Soll). The hypothesis is that moisture decreases significantly with increasing distance for each kettle hole.

This is my data frame:

```
df <- structure(list(Datum = structure(c(18703, 18703, 18703, 18703,
18724, 18724, 18724, 18724, 18730, 18730, 18730, 18730, 18744,
18744, 18744, 18744, 18758, 18758, 18758, 18758, 18774, 18774,
18774, 18774), class = "Date"), Soll = c("1192", "1192", "149",
"149", "1192", "1192", "149", "149", "1192", "1192", "149", "149",
"1192", "1192", "149", "149", "1192", "1192", "149", "149", "1192",
"1192", "149", "149"), Transtyp = structure(c(1L, 2L, 1L, 2L,
1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L,
1L, 2L, 1L, 2L), .Label = c("2", "5"), class = "factor"), Feuchte = c(36.15,
36.6518518518519, 37.66, 37.8310344827586, 28.7625, 30.128125,
27.271875, 23.0645161290323, 31.903125, 32.15625, 31.740625,
29.9875, 14.6290322580645, 14.6516129032258, 15.058064516129,
13.159375, 13.675, 13.7896551724138, 12.390625, 9.690625, 16.2586206896552,
17.441935483871, 24.24375, 20.24375)), row.names = c(NA, -24L
), class = c("tbl_df", "tbl", "data.frame"))
```

This is my code so far:

```
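library(dplyr)  # %>% and ungroup()
library(purrr)  # map_df()
# One paired Wilcoxon test per kettle hole (Soll)
df %>% ungroup() %>%
  split(.$Soll) %>%
  map_df(~ broom::tidy(wilcox.test(Feuchte ~ Transtyp, data = .x, paired = TRUE)), .id = "Soll")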
```

Am I really testing what I described above? The results are confusing to me. I also know that you can use a "," instead of "~". What is the difference between the two, which one do I need, and why? I am really stuck and I can't find a good explanation. Thanks a lot in advance!

Cheers

Solution

Yes, it appears you are performing the calculation correctly. Whether to use the ~ or the , depends on the form your data is in.

In your example above, your data frame has one column of dependent values (Feuchte) and one column holding the independent variable (Transtyp), so the formula style "y ~ x" (y as a function of x) is correct.

On the other hand, if you have two separate vectors of data, you need the , format (y1 compared to y2).
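The difference between the two interfaces can be sketched on a small made-up data set. The `toy` data frame and its column names are hypothetical, and the test is left unpaired so that the snippet does not depend on how a given R version handles `paired` in the formula method (as far as I know, R 4.4.0 and later reject `paired = TRUE` there, which makes the two-vector form the safer choice for paired data).

```r
# Hypothetical toy data: one value column and one two-level grouping column.
toy <- data.frame(
  value = c(5.1, 6.3, 4.8, 7.2, 8.9, 9.4, 7.7, 10.1),
  group = rep(c("a", "b"), each = 4)
)

# Formula interface: value as a function of group, looked up in `toy`.
by_formula <- wilcox.test(value ~ group, data = toy)

# Two-vector interface: the same two samples passed as separate vectors.
by_vectors <- wilcox.test(toy$value[toy$group == "a"],
                          toy$value[toy$group == "b"])

# Both calls run the same rank-sum test, so statistic and p-value agree.
same_statistic <- all.equal(unname(by_formula$statistic),
                            unname(by_vectors$statistic))
same_p_value   <- all.equal(by_formula$p.value, by_vectors$p.value)
print(c(same_statistic, same_p_value))
```

The formula form is convenient for long-format data, while the two-vector form makes it explicit which observations are compared with which.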

To demonstrate using your example:

```
df %>% ungroup() %>%
  split(.$Soll) %>%
  map_df(~ broom::tidy(wilcox.test(Feuchte ~ Transtyp, data = .x, paired = TRUE)), .id = "Soll")
# A tibble: 2 × 5
# Soll statistic p.value method alternative
# <chr> <dbl> <dbl> <chr> <chr>
#1 1192 0 0.0313 Wilcoxon signed rank exact test two.sided
#2 149 20 0.0625 Wilcoxon signed rank exact test two.sided
```

Now extracting Transtyp == 2 and Transtyp == 5 for Soll == 1192 and passing them as two separate vectors:

```
sg<-df %>% ungroup() %>% split(.$Soll)
wilcox.test(sg$`1192`$Feuchte[sg$`1192`$Transtyp==2], sg$`1192`$Feuchte[sg$`1192`$Transtyp==5], paired = TRUE)
# Wilcoxon signed rank exact test
#
#data: sg$`1192`$Feuchte[sg$`1192`$Transtyp == 2] and sg$`1192`$Feuchte[sg$`1192`$Transtyp == 5]
#V = 0, p-value = 0.03125
#alternative hypothesis: true location shift is not equal to 0
```

As you can see, V = 0 and the p-value is 0.0313 (0.03125 before rounding) in both cases for Soll == 1192.
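Since the stated hypothesis is directional (moisture decreases with distance), a one-sided test may be closer to the question than the two-sided default. The following is only a sketch in base R using the two-vector form. The data frame name `moisture` and the helper names `near` and `far` are made up for illustration, and the values are copied from the question. The pairing relies on the rows being sorted by date within each kettle hole.

```r
# Values copied from the question's data frame (four rows per sampling date).
moisture <- data.frame(
  Datum    = rep(as.Date(c(18703, 18724, 18730, 18744, 18758, 18774),
                         origin = "1970-01-01"), each = 4),
  Soll     = rep(rep(c("1192", "149"), each = 2), times = 6),
  Transtyp = rep(c("2", "5"), times = 12),
  Feuchte  = c(36.15, 36.6518518518519, 37.66, 37.8310344827586,
               28.7625, 30.128125, 27.271875, 23.0645161290323,
               31.903125, 32.15625, 31.740625, 29.9875,
               14.6290322580645, 14.6516129032258, 15.058064516129, 13.159375,
               13.675, 13.7896551724138, 12.390625, 9.690625,
               16.2586206896552, 17.441935483871, 24.24375, 20.24375)
)

# One paired, one-sided test per kettle hole.
one_sided <- lapply(split(moisture, moisture$Soll), function(g) {
  g <- g[order(g$Datum), ]                 # pairs are matched by sampling date
  near <- g$Feuchte[g$Transtyp == "2"]     # moisture at distance 2
  far  <- g$Feuchte[g$Transtyp == "5"]     # moisture at distance 5
  # alternative = "greater" tests whether near - far is shifted above 0,
  # i.e. whether moisture is lower at the larger distance.
  wilcox.test(near, far, paired = TRUE, alternative = "greater")
})

p_values <- sapply(one_sided, function(res) res$p.value)
print(p_values)
# 1192: 1, 149: 0.03125
```

For Soll 1192 the one-sided p-value is 1 because moisture was higher at distance 5 on every sampling date, so the significant two-sided result there points in the opposite direction of the hypothesis.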
