Filtering the Results of expand.grid


I am trying to generate a list of all combinations of numbers that satisfy all of the following conditions:

  • Any combination is exactly 6 numbers long
  • The possible numbers are only 1,5,7
  • 1 can only be followed by either 1 or 5
  • 5 can only be followed by either 5 or 7
  • 7 can only be followed by 7
  • There must be at least two 1's

I tried to do this with the expand.grid function.

Step 1: First, I generated all length-6 combinations of 1, 5, and 7:

numbers <- c(1, 5, 7)
all_combinations <- data.frame(expand.grid(rep(list(numbers), 6)))

Step 2: Then, I tried to add variables that test the conditions:

all_combinations$starts_with_1 <- ifelse(all_combinations$Var1 == 1, "yes", "no")

all_combinations$numbers_ascending <- apply(all_combinations, 1, function(x) all(diff(as.numeric(x)) >= 0))
all_combinations$numbers_ascending <- ifelse(all_combinations$numbers_ascending, "yes", "no")

all_combinations$at_least_two_ones <- apply(all_combinations, 1, function(x) sum(x == 1) >= 2)
all_combinations$at_least_two_ones <- ifelse(all_combinations$at_least_two_ones, "yes", "no")

Step 3: Finally, I tried to keep rows that satisfy all 3 conditions:

all_combinations <- all_combinations[all_combinations$starts_with_1 == "yes" & all_combinations$numbers_ascending == "yes" & all_combinations$at_least_two_ones == "yes", ]

all_combinations

However, the results are all NA:

      Var1 Var2 Var3 Var4 Var5 Var6 starts_with_1 numbers_ascending at_least_two_ones
NA      NA   NA   NA   NA   NA   NA          <NA>              <NA>              <NA>
NA.1    NA   NA   NA   NA   NA   NA          <NA>              <NA>              <NA>
NA.2    NA   NA   NA   NA   NA   NA          <NA>              <NA>              <NA>
NA.3    NA   NA   NA   NA   NA   NA          <NA>              <NA>              <NA>
NA.4    NA   NA   NA   NA   NA   NA          <NA>              <NA>              <NA>
NA.5    NA   NA   NA   NA   NA   NA          <NA>              <NA>              <NA>
NA.6    NA   NA   NA   NA   NA   NA          <NA>              <NA>              <NA>
NA.7    NA   NA   NA   NA   NA   NA          <NA>              <NA>              <NA>
NA.8    NA   NA   NA   NA   NA   NA          <NA>              <NA>              <NA>
NA.9    NA   NA   NA   NA   NA   NA          <NA>              <NA>              <NA>
NA.10   NA   NA   NA   NA   NA   NA          <NA>              <NA>              <NA>
NA.11   NA   NA   NA   NA   NA   NA          <NA>              <NA>              <NA>
NA.12   NA   NA   NA   NA   NA   NA          <NA>              <NA>              <NA>
NA.13   NA   NA   NA   NA   NA   NA          <NA>              <NA>              <NA>
NA.14   NA   NA   NA   NA   NA   NA          <NA>              <NA>              <NA>
NA.15   NA   NA   NA   NA   NA   NA          <NA>              <NA>              <NA>
NA.16   NA   NA   NA   NA   NA   NA          <NA>              <NA>              <NA>
NA.17   NA   NA   NA   NA   NA   NA          <NA>              <NA>              <NA>
NA.18   NA   NA   NA   NA   NA   NA          <NA>              <NA>              <NA>
NA.19   NA   NA   NA   NA   NA   NA          <NA>              <NA>              <NA>
NA.20   NA   NA   NA   NA   NA   NA          <NA>              <NA>              <NA>

Note: I am trying to do this in a flexible way so that if I need to change something (e.g. modify to at least three 1's, or modify to 7 appearing before 5), I can quickly create a variable to test for this condition. This is why I am using the expand.grid approach.


Solution

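  • This may need adjusting, but how about a regex approach?
    Collapse each row into a single string, keep the rows that match the required pattern, and drop the rows that contain a forbidden pair: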

    library(tidyverse)
    
    # ----------------
    my_numbers <- c(1, 5, 7)
    my_combinations <- data.frame(expand.grid(rep(list(my_numbers), 6)))
    
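    # Patterns
    # "11" stands in for "at least two 1s": the forbidden pairs below
    # force every 1 to the start of the string, so two 1s are always adjacent.
    looking <- str_c(
      sep = "|",
      "1{2}")      # Two consecutive "1"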
    
    not_looking <- str_c(
      sep = "|",
      "17",        # 1 can only be followed by either 1 or 5
      "51",        # 5 can only be followed by either 5 or 7
      "71", "75")  # 7 can only be followed by 7
    
    # ----------------
    my_output <- my_combinations %>% 
      rowwise() %>% 
      mutate(combo = str_flatten(c_across(starts_with("var")))) %>% 
      filter(str_detect(combo, looking), !str_detect(combo, not_looking))
    

    The output:

    > my_output
    # A tibble: 11 × 7
    # Rowwise: 
        Var1  Var2  Var3  Var4  Var5  Var6 combo 
       <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <chr> 
     1     1     1     1     1     1     1 111111
     2     1     1     1     1     1     5 111115
     3     1     1     1     1     5     5 111155
     4     1     1     1     5     5     5 111555
     5     1     1     5     5     5     5 115555
     6     1     1     1     1     5     7 111157
     7     1     1     1     5     5     7 111557
     8     1     1     5     5     5     7 115557
     9     1     1     1     5     7     7 111577
    10     1     1     5     5     7     7 115577
    11     1     1     5     7     7     7 115777
    

    Created on 2024-05-01 with reprex v2.1.0
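
As for why the original attempt returned only NA rows: the most likely cause is that `apply()` was run on the whole data frame after the character column `starts_with_1` had been added. `apply()` converts its input to a matrix, so each row becomes a character vector, `as.numeric("yes")` gives NA, and that NA carries through `all()` and the final `&` filter. An "ascending" check on its own would also accept a 1 followed directly by a 7. Below is a minimal base-R sketch that avoids both problems by testing only the six value columns and keeping the flags logical. The names `value_cols`, `allowed_next`, and `follows_rules` are made up for illustration and are not part of the original code.

```r
# Build all 3^6 = 729 candidate sequences, as in the question.
numbers <- c(1, 5, 7)
grid <- expand.grid(rep(list(numbers), 6))
value_cols <- names(grid)  # "Var1" ... "Var6", captured before any flag column exists

# Hypothetical rule table: for each number, the numbers allowed to follow it.
allowed_next <- list("1" = c(1, 5), "5" = c(5, 7), "7" = 7)

# TRUE when every adjacent pair (x[i], x[i + 1]) is permitted by `rules`.
follows_rules <- function(x, rules) {
  all(mapply(function(a, b) b %in% rules[[as.character(a)]],
             head(x, -1), tail(x, -1)))
}

# Work on a numeric matrix of the value columns only, so nothing is coerced to text.
vals <- as.matrix(grid[value_cols])
grid$valid_transitions <- apply(vals, 1, follows_rules, rules = allowed_next)
grid$at_least_two_ones <- rowSums(vals == 1) >= 2

# Logical flags can be combined directly; no "yes"/"no" strings needed.
result <- grid[grid$valid_transitions & grid$at_least_two_ones, value_cols]
nrow(result)  # 11
```

Changing a rule then means editing one entry of `allowed_next` or the threshold in `>= 2` (for example `>= 3` for at least three 1's), which keeps the one-variable-per-condition style from the question.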