# Is there a way to apply a Wilcoxon test grouped by site?

I want to use a two-sided Wilcoxon test to compare two treatments across multiple groups, i.e. there is a before and an after treatment, with Conc measured for each of several sample sites. I want to split the dataset into a list by Site and then apply the test, so that I get a separate output for each Site. However, I am having trouble setting this up as a function that can repeat.

I have a number of sites (Site) and two levels of treatment (Scenario), with resulting scores (Conc):

```
'data.frame': 7344 obs. of 6 variables:
$ Site : chr "A" "B" "C" "D" ...
$ Scenario : chr "1" "1" "1" "1" "2" "2" "2" "2" ...
$ Conc : num 4.7727 0.055 0.0552 0.055 0.055 ...
```

There are multiple Conc data points (~60) within each Site/Scenario combination. I chose a Wilcoxon test primarily because I have slightly uneven sample numbers between treatments (Scenario) for each Site.

When I use this code for the entire dataset, I get a sensible result:

```
t1 <- wilcox.test(Conc ~ Scenario, data = data.frame)
t1
```

However, this code doesn't apply the test for each site individually.

I have looked at all the similar examples I could find (on SO and elsewhere), and this is the best code I could come up with:

```
t2 = data.frame %>% group_by(Site) %>% do(tidy(wilcox.test(Conc~Scenario, data=data.frame), na.rm=TRUE, equal.var=FALSE))
t2
```

This code gives me an output for each site, but all the test outputs are the same, even the p-value:

```
# A tibble: 107 x 5
# Groups: Site [107]
Site statistic p.value method alternative
<chr> <dbl> <dbl> <chr> <chr>
1 A 6145702 0.690 Wilcoxon rank sum test with continuity correction two.sided
2 B 6145702 0.690 Wilcoxon rank sum test with continuity correction two.sided
3 C 6145702 0.690 Wilcoxon rank sum test with continuity correction two.sided
4 D 6145702 0.690 Wilcoxon rank sum test with continuity correction two.sided
5 E 6145702 0.690 Wilcoxon rank sum test with continuity correction two.sided
6 F 6145702 0.690 Wilcoxon rank sum test with continuity correction two.sided
```

Can anyone see what I'm doing wrong? Thanks for your help.

Solution

**EDITED 21/08/2020 to more closely mirror your data**

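The call in the question passes `data = data.frame`, i.e. the full dataset, to `wilcox.test()` for every group, so every site gets the same result. The test needs to receive only that site's rows.

Here's a solution with `dplyr` and `purrr` that splits the data by site first.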

**EDITED to include broom::tidy results...**

```
# 'data.frame': 5626 obs. of 3 variables:
# $ Site.Year: Factor w/ 3 levels "Baffle Creek at Newton Road_2018_2019",..: 1 1 1 1 1 1 1 1 1 1 ...
# $ Scenario : chr "FF_Total" "FF_Total" "FF_Total" "FF_Total" ...
# $ PAF : num 4.77 4.77 4.77 4.77 4.77
# Simulate data with the same structure as above
set.seed(2020)
Site.Year <- rep(c("Baffle Creek at Newton Road_2018_2019",
                   "Baffle Creek at Newton Road_2017_2018",
                   "Baffle Creek at Newton Road_2019_2020"), 50)
Scenario <- rep_len(c(rep("FF_Total", 4), rep("Not_FF_Total", 4)), 150)
PAF <- rnorm(150, mean = 2.5, sd = 1)
DailyPAF_long <- data.frame(Site.Year, Scenario, PAF)
DailyPAF_long$Site.Year <- factor(DailyPAF_long$Site.Year)
# str(DailyPAF_long)
# wilcox.test(PAF ~ Scenario, data = DailyPAF_long)
library(dplyr)
library(purrr)
DailyPAF_long %>%
  # split on the data frame's own column, not the global Site.Year vector
  base::split(.$Site.Year) %>%
  # one test per site: `.` is that site's subset
  purrr::map(~ wilcox.test(PAF ~ Scenario, data = .)) %>%
  # collect the tidied results into one tibble
  purrr::map_dfr(~ broom::tidy(.))
#> # A tibble: 3 x 4
#> statistic p.value method alternative
#> <dbl> <dbl> <chr> <chr>
#> 1 361 0.355 Wilcoxon rank sum exact test two.sided
#> 2 219 0.0723 Wilcoxon rank sum exact test two.sided
#> 3 380 0.195 Wilcoxon rank sum exact test two.sided
```
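
The output above does not say which row belongs to which site. As a minimal sketch of one way to keep the label, here is the same split-then-test idea in base R only. The `toy` data frame and the `per_site` and `result` names are made up for illustration; only the column names `Site`, `Scenario` and `Conc` come from the question.

```r
# Hypothetical toy data shaped like the question's: Site, Scenario, Conc
set.seed(1)
toy <- data.frame(
  Site     = rep(c("A", "B", "C"), each = 40),
  Scenario = rep(c("1", "2"), times = 60),
  Conc     = rnorm(120, mean = 2.5, sd = 1)
)

# Run one test per site; each call sees only that site's rows
per_site <- lapply(split(toy, toy$Site), function(d) {
  wt <- wilcox.test(Conc ~ Scenario, data = d)
  data.frame(statistic = unname(wt$statistic), p.value = wt$p.value)
})

# Stack the per-site rows and turn the list names into a Site column
result <- do.call(rbind, per_site)
result <- data.frame(Site = rownames(result), result, row.names = NULL)
result
```

With the `purrr` pipeline, passing `.id = "Site.Year"` to `map_dfr()` should give the same kind of label column. In the original `do()` approach, replacing `data = data.frame` with `data = .` should make each group use its own rows.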
