
How to add a row to a data frame (after conversion) in R?


I am trying to add a row to my data frame after I order it but I keep running into problems.

This is what I have tried:

colnames(col_freq) <- c("Symptoms", "values")
col_freq <- col_freq[order(-col_freq$values),]
top_freq <- rbind(col_freq[1:10,], c("Others", sum(col_freq[10:nrow(col_freq),2])))

The above code, however, results in a data frame with a missing value.

How do I add this row (c("Others", sum(col_freq[10:nrow(col_freq),2]))) to my data frame? Thank you in advance.


Solution

  • We can use rbind with a list instead of a vector, since a list can hold elements of different types

    rbind(col_freq[1:10,], list("Others", sum(col_freq[1:nrow(col_freq),2])))
    

    With c, which creates an atomic vector, a single non-numeric element changes the type of the whole vector, following R's coercion hierarchy. character ranks above numeric, so the sum would be stored as a character string.
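
    As a small illustration of that coercion, compare what c and list do with mixed inputs. The names row_vec and row_list and the value 172 are only placeholders for this sketch:

    ```r
    # c() builds an atomic vector, so every element must share one type
    row_vec <- c("Others", 172)
    class(row_vec)            # "character": 172 has become the string "172"

    # list() keeps each element's own type
    row_list <- list("Others", 172)
    sapply(row_list, class)   # "character" "numeric"
    ```

    Because rbind fills the data frame column by column, passing the list lets the numeric column receive a number rather than a string.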


    In the OP's code, the missing value would be generated if the first column is a factor, because 'Others' is not one of its levels. So, we need to convert that column to character or add a new level 'Others' before doing the rbind

    col_freq$Symptoms <- as.character(col_freq$Symptoms)
    

    Or, if it needs to remain a factor

    col_freq$Symptoms <- factor(col_freq$Symptoms, levels = c(levels(col_freq$Symptoms), "Others"))
    

    Or assign the new level directly

    levels(col_freq$Symptoms) <- c(levels(col_freq$Symptoms), "Others")
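
    Why the level has to exist first can be seen on a bare factor, without any data frame involved. This is a minimal sketch: f and g are hypothetical vectors, not part of the original data, and it assumes rbind runs into the same factor assignment rule when it fills the Symptoms column:

    ```r
    # g has no "Others" level, so assigning it produces NA (R also warns
    # about an invalid factor level; the warning is silenced here)
    g <- factor(c("fat", "sleep"))
    suppressWarnings(g[3] <- "Others")
    is.na(g[3])    # TRUE

    # f gets the extra level first, so the same assignment is valid
    f <- factor(c("fat", "sleep"))
    levels(f) <- c(levels(f), "Others")
    f[3] <- "Others"
    is.na(f[3])    # FALSE
    ```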
    

    Now, we do the rbind

    rbind(col_freq[1:10,], list("Others", sum(col_freq[1:nrow(col_freq),2])))
    #   Symptoms values
    #1       fat     40
    #2     sleep     25
    #3     irrit     21
    #4       rem     19
    #5      Neck     18
    #6      conc     12
    #7      nerv     11
    #8      Pres      9
    #9      slow      9
    #10     Head      8
    #11   Others    172
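
    Note that the call above sums every row (1:nrow(col_freq)), which is why 'Others' shows 172 for the 10-row example data. If the real data has more than ten rows and 'Others' should cover only the rows outside the top ten, one possible sketch is the following. The two extra rows ("extra1", "extra2") and the names n_top, top and rest are invented for illustration, and stringsAsFactors = FALSE is set explicitly (it is the default from R 4.0.0 onward) so the factor issue does not arise:

    ```r
    # Hypothetical 12-row data: the last two rows are made up
    col_freq <- data.frame(
      Symptoms = c("fat", "sleep", "irrit", "rem", "Neck", "conc",
                   "nerv", "Pres", "slow", "Head", "extra1", "extra2"),
      values = c(40, 25, 21, 19, 18, 12, 11, 9, 9, 8, 5, 3),
      stringsAsFactors = FALSE
    )

    # Sort by descending frequency, then split into top rows and the rest
    col_freq <- col_freq[order(-col_freq$values), ]
    n_top <- 10
    top  <- head(col_freq, n_top)
    rest <- tail(col_freq, -n_top)   # every row after the first n_top

    # list() keeps "Others" as character and the sum as numeric
    top_freq <- rbind(top, list("Others", sum(rest$values)))
    ```

    Using head and tail avoids an index such as 11:nrow(col_freq), which counts backwards when the data frame has only ten rows.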
    

    data

    col_freq <- data.frame(Symptoms = c("fat", "sleep", "irrit", "rem","Neck", "conc",
        "nerv", "Pres", "slow", "Head"), values = c(40, 25, 21, 19, 18, 12, 11, 9, 9, 8))