How can I delete the rownames which contain specific text in R?


I would like to remove the row names only for the rows whose names contain "Apple_".

library(tibble)  # provides column_to_rownames()

df <- data.frame('fruit'=c("Apple_1", "Apple_2", "Apple_3", "Pear_1", "Pear_2", "Pear_3"),
                 'color'=c("Red", "Red", "Green","Green","Green","Green"))
df <- column_to_rownames(df, var="fruit")

None of these work, I believe because no row is named exactly "Apple":

row.names.remove <- c("Apple")
df[!(row.names(df) %in% row.names.remove), ]
df2<- length(which(grepl("Apple", rownames(df))))

df3<- df[row.names(df) != "Apple", , drop = FALSE]

Solution

  • We can drop the matching rows by using grep with a negative index (-)

    df[-grep('Apple', row.names(df)),, drop = FALSE]
    

    Or use invert = TRUE to select the rows that do not match

    df[grep('Apple', row.names(df), invert = TRUE),, drop = FALSE]
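
    As a complement to the two calls above, here is a minimal sketch of a logical-index variant using grepl. It is not part of the original answer. It rebuilds the example data with base R only, and the "Banana" pattern is a made-up value that matches no row, used to show how the negative-index form behaves in that case.

    ```r
    # Rebuild the example data with base R only (no tibble needed)
    df <- data.frame(color = c("Red", "Red", "Green", "Green", "Green", "Green"),
                     row.names = c("Apple_1", "Apple_2", "Apple_3",
                                   "Pear_1", "Pear_2", "Pear_3"))

    # grepl() returns one TRUE/FALSE per row name, so it can be negated directly
    keep <- !grepl("Apple", row.names(df))
    no_apples <- df[keep, , drop = FALSE]

    # Hypothetical pattern that matches nothing: the logical index keeps every row
    all_rows <- df[!grepl("Banana", row.names(df)), , drop = FALSE]

    # With -grep() the index becomes -integer(0), which selects zero rows
    zero_rows <- df[-grep("Banana", row.names(df)), , drop = FALSE]
    ```

    The logical form is therefore the safer choice when the pattern might not match any row name.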
    

    In a data.frame, the row names and column names attributes cannot be empty. One option is to replace the matching row names with a numeric index

    i1 <- grep('Apple', row.names(df))
    row.names(df)[i1] <- seq_along(i1)
    

    Or convert to a matrix and then change those row names to blank ("")

    m1 <- as.matrix(df)
    row.names(m1)[i1] <- ""
    

    A matrix allows duplicated row names, while a data.frame doesn't. It is also possible to remove the row names attribute completely, but only for the whole object, not for individual rows.
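
    To illustrate removing the row names from the whole object, here is a small sketch that assumes the same example data, rebuilt with base R so it runs on its own. The names df_plain and m2 are introduced only for this illustration.

    ```r
    # Same example data, built without tibble
    df <- data.frame(color = c("Red", "Red", "Green", "Green", "Green", "Green"),
                     row.names = c("Apple_1", "Apple_2", "Apple_3",
                                   "Pear_1", "Pear_2", "Pear_3"))

    # Data frame: assigning NULL resets the row names to the automatic 1..n sequence
    df_plain <- df
    rownames(df_plain) <- NULL

    # Matrix: assigning NULL drops the row names altogether
    m2 <- as.matrix(df)
    rownames(m2) <- NULL
    ```

    Note that the data.frame still reports row names "1" to "6" afterwards, while rownames(m2) is NULL.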