Match regex with overlap for DNA

I am trying to match DNA sequences that start at the beginning of the string or at a multiple of 3 letters from the beginning, begin with either ATG or CGA, are followed by 6, 9, 12, 15, ... letters, and end in AGT. The following code returns only one of the matches (the longest one). I have looked into positive lookaheads (e.g. (?=...)) but could not figure out how to apply them to this situation.

library(stringr)
dna <- "ABCATGABCGAAADFAGTAAAAGTAGTAAAGT"
str_match_all(dna, "^(...)*((?:ATG|CGA)(?:...){2,}(?:AGT))")

[[1]]
     [,1]                          [,2]  [,3]      
[1,] "ABCATGABCGAAADFAGTAAAAGTAGT" "ABC" "ATGABCGAAADFAGTAAAAGTAGT"

Desired:
ABCATGABCGAAADFAGT ABC ATGABCGAAADFAGT
ABCATGABCGAAADFAGTAAAAGT ABC ATGABCGAAADFAGTAAAAGT
ABCATGABCGAAADFAGTAAAAGTAGT ABC ATGABCGAAADFAGTAAAAGTAGT

Solution

I know you're looking for a regex, but perhaps it's easier if you program it out:

Use the regex .{3} to split the string into consecutive triplets,
Find the positions of the start triplets (ATG, CGA) and the stop triplets (AGT),
Create all possible start/stop combinations,
Keep only the combinations in which the stop comes after the start, and
Take the corresponding fragments of the original string.

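library(stringr)
library(dplyr)  # also provides %>%

dna <- "ABCATGABCGAAADFAGTAAAAGTAGTAAAGT"
# complete triplets only; a trailing partial triplet is dropped
triplets <- str_extract_all(dna, ".{3}")[[1]]
tidyr::expand_grid(
  start = which(triplets %in% c("ATG", "CGA")),
  stop = which(triplets == "AGT")
) %>%
  # stop must follow start with at least 2 triplets (6 letters) in between
  dplyr::filter(stop - start >= 3) %>%
  dplyr::mutate(fragment = stringr::str_sub(dna, 3*(start-1) + 1, 3*stop))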

# A tibble: 3 x 3
  start  stop fragment                
  <int> <int> <chr>                   
1     2     6 ATGABCGAAADFAGT         
2     2     8 ATGABCGAAADFAGTAAAAGT   
3     2     9 ATGABCGAAADFAGTAAAAGTAGT
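
If you would rather avoid the tidyverse dependencies, the same steps can be sketched in base R. This is only a sketch: find_frames, starts, stops and min_gap are names made up for illustration, and min_gap = 2 is my reading of the "followed by 6, 9, 12, ... letters" requirement (at least two triplets between the start and stop triplets).

```r
# Hypothetical helper (not from any package): enumerate every in-frame
# fragment that begins with a start triplet and ends with a stop triplet.
find_frames <- function(dna, starts = c("ATG", "CGA"), stops = "AGT", min_gap = 2) {
  # split into complete triplets; a trailing partial triplet is ignored
  triplets <- regmatches(dna, gregexpr(".{3}", dna))[[1]]

  # all pairings of start-triplet and stop-triplet indices
  grid <- expand.grid(
    start = which(triplets %in% starts),
    stop  = which(triplets %in% stops)
  )

  # keep pairs with at least min_gap triplets strictly between start and stop
  grid <- grid[grid$stop - grid$start > min_gap, , drop = FALSE]
  grid <- grid[order(grid$start, grid$stop), , drop = FALSE]
  rownames(grid) <- NULL

  # triplet index i covers characters 3*i - 2 through 3*i
  grid$fragment <- substring(rep(dna, nrow(grid)), 3 * grid$start - 2, 3 * grid$stop)
  grid
}

res <- find_frames("ABCATGABCGAAADFAGTAAAAGTAGTAAAGT")
print(res)
```

For the example string this should give the same three fragments as the tibble above. Setting min_gap = 0 reproduces a plain start < stop filter.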