
Subsetting string vector by word count in one line


I have a vector of strings

rownames
[1] "multifarmacias descuento" "multifarmacias"           "multifarmacias"

My goal is to subset rownames in one line, keeping only the strings that contain a single word. The output would be

[1] "multifarmacias"           "multifarmacias"

I have tried the following but it throws an error:

rownames[which(sapply(strsplit(rownames, " "),length)) == 1]

Error in which(sapply(strsplit(rownames, " "), length)) : 
  argument to 'which' is not logical

Is there an elegant way to subset a string vector by the number of words in each string?


Solution

  • It would be easier with str_count

    library(stringr)
    rownames[str_count(rownames, "\\w+") == 1]
    #[1] "multifarmacias" "multifarmacias"
    

    Using strsplit with lengths (both from base R) would be more efficient

    rownames[lengths(strsplit(rownames, "\\s+")) == 1]
    #[1] "multifarmacias" "multifarmacias"
    

    The error in the OP's post comes from a misplaced ). It should come after the == 1. As written, which is applied directly to the numeric vector of lengths rather than to a logical vector, i.e.

     which(c(2, 1, 1))
    

    Error in which(c(2, 1, 1)) : argument to 'which' is not logical

    data

    rownames <- c("multifarmacias descuento", "multifarmacias", "multifarmacias")
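
For completeness, here is a minimal sketch of the OP's original approach with the parenthesis moved so that the comparison happens inside which. The intermediate names word_counts, idx and one_word are only illustrative and do not appear in the original post.

```r
# Sample data copied from the answer above
rownames <- c("multifarmacias descuento", "multifarmacias", "multifarmacias")

# Count words per string by splitting on single spaces, as in the question
word_counts <- sapply(strsplit(rownames, " "), length)

# Compare first, then hand the resulting logical vector to which()
idx <- which(word_counts == 1)

# Keep only the one-word strings
one_word <- rownames[idx]
print(one_word)
```

The which call is optional here, since the logical vector word_counts == 1 can index rownames directly, which is what the two solutions above do.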