
How to count the number of rows that match multiple character criteria?


I have a data frame in R like this:

ID     Type
---------------------------
1      Green-Red-Red-Green
2      Pink-Blue-Red-Red
3      Green-Green-Red
4      Pink-Blue-Blue-Green
5      Red-Red-Red-Green

I want to count the number of rows that contain the words Green and Red but neither Pink nor Blue.

In this case, the count would be 3 (the rows with ID = 1, 3 and 5).

I can't figure out how to do this with multiple criteria on character strings. How can I do that, please?


Solution

  • You can do

    library(data.table)

    dt <- as.data.table(data_frame) # convert your data frame to a data.table
    # & stands for AND, ! stands for NOT
    nrow(dt[(Type %like% "Green") & (Type %like% "Red") &
            !(Type %like% "Pink") & !(Type %like% "Blue")])
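
    If you would rather not load a package, the same count can be sketched in base R with grepl(). This is only an illustration: the data frame is rebuilt from the table in the question, and the names required, excluded, keep, n_matching and matching_ids are made up for this example.

    ```r
    # Sample data recreated from the question (assumed to match the real data frame)
    data_frame <- data.frame(
      ID = 1:5,
      Type = c("Green-Red-Red-Green",
               "Pink-Blue-Red-Red",
               "Green-Green-Red",
               "Pink-Blue-Blue-Green",
               "Red-Red-Red-Green"),
      stringsAsFactors = FALSE
    )

    # Hypothetical helper vectors: words that must appear and words that must not
    required <- c("Green", "Red")
    excluded <- c("Pink", "Blue")

    # TRUE where a row contains every required word (literal match, no regex)
    has_all <- Reduce(`&`, lapply(required, function(w) {
      grepl(w, data_frame$Type, fixed = TRUE)
    }))

    # TRUE where a row contains at least one excluded word
    has_excluded <- Reduce(`|`, lapply(excluded, function(w) {
      grepl(w, data_frame$Type, fixed = TRUE)
    }))

    keep <- has_all & !has_excluded
    n_matching <- sum(keep)               # number of matching rows
    matching_ids <- data_frame$ID[keep]   # which IDs matched
    ```

    Keeping the words in two vectors means you can add or remove criteria without rewriting the condition.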
    

    UPDATE, in response to a question in the comments

    This will give you the number of characters between "Pink" and "Blue"

    library(stringr)

    string <- "Pink-Green-Blue-Red"
    tmp <- str_match(string, "Pink(.*?)Blue") # column 2 holds the captured group
    nchar(tmp[, 2])

    So you can do

    dt[, tmp := str_match(Type, "Pink(.*?)Blue")[, 2]] # keep only the captured group
    nrow(dt[!is.na(tmp)])
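
For reference, here is a rough base R equivalent of the update that uses regexec() and regmatches() instead of stringr. It is a sketch only: the vector is copied from the question's Type column, and the names types, between, n_pink_blue and gap_chars are invented for the example.

```r
# Type values copied from the question (assumed sample data)
types <- c("Green-Red-Red-Green",
           "Pink-Blue-Red-Red",
           "Green-Green-Red",
           "Pink-Blue-Blue-Green",
           "Red-Red-Red-Green")

# For each string: character(0) if there is no match,
# otherwise c(full match, text captured between "Pink" and "Blue")
m <- regmatches(types, regexec("Pink(.*?)Blue", types, perl = TRUE))

# Pull out the captured group, using NA where "Pink ... Blue" does not occur
between <- vapply(
  m,
  function(x) if (length(x) >= 2) x[2] else NA_character_,
  character(1)
)

n_pink_blue <- sum(!is.na(between))  # rows where "Pink" is later followed by "Blue"
gap_chars <- nchar(between)          # characters between the two words (NA if no match)
```

Note that the captured text includes the separators, so in "Pink-Blue-Red-Red" the gap is the single "-" between the two words.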