# How to break my sample into uneven categories?

We usually use `quartiles`, `quantiles`, or `ntiles` to split a sample. We can also use the function `cut`.

I have a numeric variable that I would like to use to split my sample into three categories, but the categories should not be equally sized. For example, the `quartile` function would split it into four equally sized quartiles: the 0 to 25, 26 to 50, 51 to 75, and 76 to 100 percentiles. Therefore, the first three functions I mentioned cannot do the job. We can probably split the variable using `cut`, but I don't know how to do it in terms of percentiles. **I would like to create a variable that splits the sample into the bottom 0 to 20th percentiles, then 21 to 60, then 61 to 100.**

Here is a reproducible example:

```
library(dplyr)
set.seed(1)
df <- tibble(
  V1 = round(runif(1000, min = 1, max = 1000)),
  V2 = round(runif(1000, min = 1, max = 3)),
  V3 = round(runif(1000, min = 1, max = 10))
)
df$V2 <- as.factor(df$V2)
df$V3 <- as.factor(df$V3)
df <- df %>%
  group_by(V2, V3) %>%
  mutate(quartile = ntile(V1, 4))
```

Solution

I'm not 100% sure if this is what you're looking for, and I'll admit it's not the most elegant code ever written, but would something like:

```
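# define your percentile limits as row positions in the sorted sample
# (this is just based on googling how to calculate percentiles)
cut.20 <- 20 / 100 * nrow(df)
cut.60 <- 60 / 100 * nrow(df)

df <- df %>%
  ungroup() %>%                     # drop the grouping left over from the question's code
  arrange(V1) %>%                   # sort so the row index reflects the rank of V1
  mutate(index = row_number()) %>%
  group_by(V2, V3) %>%              # re-group as in the question; the bands are still sample-wide
  mutate(centile = case_when(index > 0 & index <= cut.20 ~ "0-20",
                             index > cut.20 & index <= cut.60 ~ "21-60",
                             index > cut.60 ~ "61-100"))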
```

do what you're looking for?
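
Since the question mentions `cut`, here is a minimal sketch of another way to get the same kind of split: pass the 0th, 20th, 60th, and 100th percentiles from `quantile()` to `cut()` as breaks. The helper name `split_uneven` and the column names `centile` and `centile_grp` are made up for illustration. The grouped variant assumes every `V2` x `V3` cell has enough distinct `V1` values for the breaks to be unique; if not, `cut()` will stop with an error.

```r
library(dplyr)

# Same simulated data as in the question
set.seed(1)
df <- tibble(
  V1 = round(runif(1000, min = 1, max = 1000)),
  V2 = factor(round(runif(1000, min = 1, max = 3))),
  V3 = factor(round(runif(1000, min = 1, max = 10)))
)

# Hypothetical helper (not from the original post): bin x at the given percentiles
split_uneven <- function(x,
                         probs = c(0, 0.2, 0.6, 1),
                         labels = c("0-20", "21-60", "61-100")) {
  breaks <- quantile(x, probs = probs, na.rm = TRUE)
  # include.lowest = TRUE keeps the minimum value in the first bin
  cut(x, breaks = breaks, labels = labels, include.lowest = TRUE)
}

# Percentile bands computed over the whole sample
df <- df %>% mutate(centile = split_uneven(V1))

# Percentile bands computed separately within each V2 x V3 cell
# (assumes each cell yields unique breaks, see the note above)
df_grouped <- df %>%
  group_by(V2, V3) %>%
  mutate(centile_grp = split_uneven(V1)) %>%
  ungroup()

# Rough check of the band sizes (about 20% / 40% / 40% of rows)
count(df, centile)
```

Changing `probs` and `labels` gives any other uneven split. The bands are right-closed, so a value that falls exactly on a break goes into the lower band.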

