raggregatequantile# List values in all rows and columns based on all unique values in another column to find 80th percentile value for each unique value

I have a data frame of 100 rows and 20 column. Column 1 contains IDs which can be grouped. Column 2 to 20 contains numeric integer values. The aim is to find the 75 percentile value for each unique ID in column 1 considering all values in column 2 to 20.

What I have tried?

I have used the code in the following link using aggregate() function.

`aggregate(GO~GeneID,data=df,FUN = function(x) paste0(x,collapse = '; ')) `

How to list row values in a column based on grouping value in R?

However, you could use this code to make a list of values for one column at a time for all unique IDs in column 1. So the 80th percentile of the list of values when derived is only the 80th percentile for the values in that column. What I want is 80th percentile of all values in all columns for each unique ID in column 1.

Solution

A `data.table`

grouping operation with `.SD`

makes it easy.

```
library(data.table)
setDT(df)[,.(p75 = quantile(unlist(.SD), 0.75)), keyby = id]
#> id p75
#> 1: 1 797.00
#> 2: 2 786.00
#> 3: 3 753.75
#> 4: 4 724.50
#> 5: 5 793.75
#> 6: 6 781.75
#> 7: 7 808.00
#> 8: 8 788.50
#> 9: 9 761.00
#> 10: 10 784.00
```

Data:

```
set.seed(1935200967)
df <- cbind(
data.frame(id = sample(10, 100, 1)),
matrix(sample(1e3, 1900, 1), 100)
)
```

- Loop to pull every ID into dataframe from API in R
- Transforming a dataframe by creating a new column
- Using tryCatch to avoid erros in a double loop
- Plotly Radar Chart from ”circle” to polygon in R
- Replace 0 with NA in certain columns in R
- R: Yearmon producing NA values
- Creating a delta variable over time for different values of an ID
- tidyr::unnest() tibble, integer and vector
- Join lines in R plot based in order in data frame
- How to correctly install rpy2 in python?
- bslib theme not coloring page_navbar as expected
- How can I generate a random number with restrictions?
- subset of data.frame columns to maximize "complete" observations
- can't execute rsDriver (connection refused)
- Select the data.frame with maximum dimensions from a list of data.frames
- nTrials must be be greater.... issue on conjoint design
- error in `na_if()`: ! Can't convert `y` <character> to match type of `x` <double>
- Getting errors installing glmnet package using R studio
- R plotly show only labels in the stacked bar chart where percentage value for the stack is above 5
- Is there a way to add superscript and subscript to the same element using ggtext,<sub>&<sup>?
- ggplot labels with "K" for thousands or "M" for millions (maintaining 'comma' y-axis labeling)
- Default value in filter_select crosstalk doesn't work with JS
- Automated procedure to draw ray in ggplot
- group_by and pivot_wider combination
- lm(): What is qraux returned by QR decomposition in LINPACK / LAPACK
- How to apply different functions for different groups and mutate?
- R studio time column is between
- Optimizing nested ifelse in shiny
- Expanding dataset rows based on intervals in R
- Free scales for facets in facet_grid change pattern