I used a survey that has 20 questions, and I calculated the mean of the 20 questions as "Total" for 400 participants. Now I need to categorise the Total into 4 groups: Total < 2 is Limited, Total >= 2 is Basic, Total < 3 is Good, and Total >= 3 is Full.
I was able to create three categories but not four, as follows:
level <- ifelse (df$Total <2, "Limited", ifelse((df$Total>= 2) & (df$Total<3), "Basic","Good"))
Then I want to see the percentage of each category, either as numbers or in a graph.
I may be misunderstanding something, but you appear to have overlapping categories: Total >= 2 is Basic, but Total < 3 is Good. You may want to confirm the bounds for your groupings. Once that's sorted, you were actually pretty close to a working solution. You can nest ifelse statements, keeping in mind that they are evaluated in order. If a condition evaluates to TRUE "early" in the chain, the call returns the output for TRUE at that point; otherwise, it moves on to the next ifelse. Note that I've used 1, 2, and 3 as the 'breaks' for the categories, so the logic reads: "If it's less than 1, it's Limited. If it's less than 2, it's Basic. If it's less than 3, it's Good. Otherwise, it's Full."
set.seed(123)
df <- data.frame(total = runif(n = 15, min = 0, max = 4))
df
# Conditions are checked in order; the first TRUE one sets the label.
df$level <- ifelse(df$total < 1, "Limited",
            ifelse(df$total < 2, "Basic",
            ifelse(df$total < 3, "Good", "Full")))
> df
       total   level
1  0.5691772 Limited
2  2.1971386    Good
3  3.8163650    Full
4  2.3419334    Good
5  1.6180411   Basic
6  2.5915739    Good
7  1.2792825   Basic
8  1.2308800   Basic
9  0.8790705 Limited
10 1.4779555   Basic
11 3.9368768    Full
12 0.6168092 Limited
13 0.3641760 Limited
14 0.5676276 Limited
15 2.7600284    Good
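To get the percentage of each category, as asked in the question, one option is base R's table() and prop.table(), with barplot() for a quick chart. This is only a sketch: the level vector below simply re-enters the labels from the printed output above so that the snippet runs on its own, and the axis label and title are placeholders. With your own data you would use df$level directly.

```r
# Labels copied from the printed data frame above, so this runs standalone.
level <- c("Limited", "Good", "Full", "Good", "Basic", "Good", "Basic", "Basic",
           "Limited", "Basic", "Full", "Limited", "Limited", "Limited", "Good")

# Fix the display order so the categories run from lowest to highest.
level <- factor(level, levels = c("Limited", "Basic", "Good", "Full"))

counts <- table(level)                     # number of rows per category
pct <- round(100 * prop.table(counts), 1)  # share of each category, in percent
print(pct)

# Bar chart of the same percentages (labels here are placeholders).
barplot(pct, ylab = "Percent of participants", main = "Total score category")
```

Converting to a factor with explicit levels is optional, but without it table() and barplot() would order the categories alphabetically.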
With just four categories, an ifelse block is probably fine. If I were using many more bounds, I'd likely use a different approach. Edit: like thelatemail's, which is far cleaner.
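For many bounds, one base R option is cut(), which bins a numeric vector in a single call. The other answer mentioned above is not reproduced here, so whether it takes this exact form is an assumption. The sketch reuses the breaks 1, 2, and 3 from the nested ifelse and checks that both approaches give the same labels.

```r
set.seed(123)
df <- data.frame(total = runif(n = 15, min = 0, max = 4))

# Assumed breaks, matching the nested ifelse() above: 1, 2, and 3.
# right = FALSE makes each interval closed on the left, e.g. [1, 2),
# so a value of exactly 2 falls into "Good", as it does with `total < 2`.
df$level <- cut(df$total,
                breaks = c(-Inf, 1, 2, 3, Inf),
                labels = c("Limited", "Basic", "Good", "Full"),
                right = FALSE)

# Cross-check against the nested ifelse() version.
nested <- ifelse(df$total < 1, "Limited",
          ifelse(df$total < 2, "Basic",
          ifelse(df$total < 3, "Good", "Full")))
stopifnot(all(as.character(df$level) == nested))
```

cut() returns a factor whose levels are already in the order given by labels, so adding another category only means adding one break and one label.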