Finding the longest stretch of repeated words in a long string of characters

I have a long DNA sequence text file with characters (ATCG). I am looking for a method in R to find the longest stretch of a repeated character. Let's say my string looks like this: AAGTGCGGGTTCAGATCGCCCCCCCATCGGGCAAAAAAAAAAAAAAAATCGA

I need the output possibly with counts, AAAAAAAAAAAAAAAA n=16

Please help me with this.

Solution

If you have one string:

library(tidyverse)
string <- "AAGTGCGGGTTCAGATCGCCCCCCCATCGGGCAAAAAAAAAAAAAAAATCGA"

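# str_extract_all() returns a list with one element per input string,
# so take the first element to get the character vector of runs
x <- str_extract_all(string, "(.)\\1+")[[1]]
x[which.max(nchar(x))]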

[1] "AAAAAAAAAAAAAAAA"

If you have many strings:

str_extract_all(c(string, string), "(.)\\1+") %>%
  map_chr(~ .x[which.max(nchar(.x))])

[1] "AAAAAAAAAAAAAAAA" "AAAAAAAAAAAAAAAA"

To get the counts, call nchar() (or str_count()) on the result.
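
If you want the run and its length together without the tidyverse, here is a minimal base R sketch. It uses rle() (run-length encoding) instead of the regex approach above. The helper name longest_run is made up for illustration, and ties go to the first longest run because which.max() returns the first maximum.

```r
string <- "AAGTGCGGGTTCAGATCGCCCCCCCATCGGGCAAAAAAAAAAAAAAAATCGA"

# Hypothetical helper: returns the longest run of a single repeated
# character in `s`, together with its length.
longest_run <- function(s) {
  chars <- strsplit(s, "")[[1]]      # split into single characters
  runs  <- rle(chars)                # consecutive runs: $values and $lengths
  i     <- which.max(runs$lengths)   # index of the (first) longest run
  data.frame(
    run = strrep(runs$values[i], runs$lengths[i]),
    n   = runs$lengths[i],
    stringsAsFactors = FALSE
  )
}

result <- longest_run(string)
print(result)

# For several strings, apply the helper to each and bind the rows
results <- do.call(rbind, lapply(c(string, string), longest_run))
print(results)
```

Unlike the "(.)\\1+" pattern, this also reports a run of length 1 when the string has no repeated characters at all.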
