I have a data.frame (DL) with a column named fruit whose values look like c("apple", "lemon", "orange", "others"). I want to change the levels of this column so that the legend order in my plot follows the order I choose. Here is my code:
DL$fruit <- factor(DL$fruit, levels=c("lemon", "apple", "orange", "others"))
But when I inspect the data afterwards with View(DL), "others" has changed to NA, and when I plot it with ggplot there is no bar for "others". Does anyone have an idea what is going on and how to fix it? Thanks.
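This sometimes happens if your data are not quite clean, for example if you have extra whitespace around the input values. factor() matches values against the levels exactly, so any value that is not identical to one of the levels becomes NA.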
Here's an example:
fruit <- c("apple", "lemon", "orange", "others", "others ") ## note the last two values
factor(fruit, levels=c("lemon", "apple", "orange", "others"))
# [1] apple lemon orange others <NA>
# Levels: lemon apple orange others
Now, let's strip out the whitespace:
newFruit <- gsub("^\\s+|\\s+$", "", fruit)
factor(newFruit, levels = unique(newFruit))
# [1] apple lemon orange others others
# Levels: apple lemon orange others
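The example above takes its levels from unique(newFruit), so the levels come out in order of appearance rather than in the order asked for in the question. Here is a minimal sketch of the same cleanup applied to the data frame while keeping the custom order. The DL built here is only a hypothetical stand-in for the real data, and it assumes stray whitespace really is the cause. trimws() is available in base R from 3.2.0 onward.

```r
# Hypothetical stand-in for the asker's data frame;
# note the trailing space in the last value
DL <- data.frame(fruit = c("apple", "lemon", "orange", "others "),
                 stringsAsFactors = FALSE)

# The legend order requested in the question
wanted <- c("lemon", "apple", "orange", "others")

# Strip leading/trailing whitespace first, then set the levels explicitly
DL$fruit <- factor(trimws(as.character(DL$fruit)), levels = wanted)

levels(DL$fruit)
# [1] "lemon"  "apple"  "orange" "others"
anyNA(DL$fruit)
# [1] FALSE
```

The as.character() call is there in case the column is already a factor, so the labels are trimmed rather than the integer codes.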
If you want to inspect the source data and look for whitespace, sometimes it helps to use print with quote = TRUE:
print(fruit, quote = TRUE)
# [1] "apple" "lemon" "orange" "others" "others "
Alternatively, grepl could also be of use:
grepl("^\\s+|\\s+$", fruit)
# [1] FALSE FALSE FALSE FALSE TRUE
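If whitespace turns out not to be the culprit (a typo or different capitalisation would produce the same NA), a more general check is to list the values that match none of the intended levels before converting. This is only a sketch reusing the fruit vector from above. The name wanted is just an illustrative variable for the level order from the question.

```r
fruit <- c("apple", "lemon", "orange", "others", "others ")
wanted <- c("lemon", "apple", "orange", "others")

# Values present in the data that are not identical to any intended level;
# these are the ones factor() would turn into NA
unmatched <- setdiff(unique(fruit), wanted)
print(unmatched, quote = TRUE)
# [1] "others "
```

An empty result (character(0)) would mean every value matches a level exactly, in which case the NA values must come from somewhere else.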