I have a set of strings, each containing a single character "X":
c("KGDDQSXQGGAPDAGQE", "TEEDSEEVXEQK", "LTXTSGETTQTHTEPTGDSK", "IXTHNSEVEEDDMDK", "SXENPEEDEDQRNPAK", "XTAEHEAAQQDLQSK", "ATVIXHGETLRRTK", "XAVAREESGKPGAHVTVK", "YHTINGHNAEVXK", "XAAEDDEDDDVDTK")
I would like to get a character vector in which each element has 11 characters: "X" in the center, with the 5 characters from the original string on each side. If there are fewer than 5 characters on one side, the missing positions are filled with "x" instead.
E.g.
"KGDDQSXQGGAPDAGQE", becomes "GDDQSXQGGAP"
"TEEDSEEVXEQK", becomes "DSEEVXEQKxx"
"LTXTSGETTQTHTEPTGDSK", becomes "xxxLTXTSGET"
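One more approach, using stringr: pad every string with five "x" on each side, then pull out the 11-character window centered on "X". Note that str_match() returns a one-column character matrix, as the output shows.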
library(stringr)
vec <- c("KGDDQSXQGGAPDAGQE", "TEEDSEEVXEQK", "LTXTSGETTQTHTEPTGDSK", "IXTHNSEVEEDDMDK", "SXENPEEDEDQRNPAK", "XTAEHEAAQQDLQSK", "ATVIXHGETLRRTK", "XAVAREESGKPGAHVTVK", "YHTINGHNAEVXK", "XAAEDDEDDDVDTK")
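vec %>%
  # add 5 "x" to each side (10 in total), so "X" always has 5 neighbours
  str_pad(width = nchar(vec) + 10, side = "both", pad = "x") %>%
  # grab the 11-character window centered on "X"
  str_match(".{5}X.{5}")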
#> [,1]
#> [1,] "GDDQSXQGGAP"
#> [2,] "DSEEVXEQKxx"
#> [3,] "xxxLTXTSGET"
#> [4,] "xxxxIXTHNSE"
#> [5,] "xxxxSXENPEE"
#> [6,] "xxxxxXTAEHE"
#> [7,] "xATVIXHGETL"
#> [8,] "xxxxxXAVARE"
#> [9,] "HNAEVXKxxxx"
#> [10,] "xxxxxXAAEDD"
Created on 2020-04-26 by the reprex package (v0.3.0)
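Since the question asks for a character vector rather than a matrix, the same pad-then-match idea could also be sketched in base R. This is only an illustration under some assumptions: `extract_window()` and its arguments `flank` and `pad` are hypothetical names made up here, not part of stringr or base R, and the sketch assumes every input string contains exactly one uppercase "X".

```r
# Hypothetical helper (not from stringr or base R): pad both ends with
# `flank` copies of `pad`, then extract the window centered on "X".
extract_window <- function(x, flank = 5, pad = "x") {
  padding <- strrep(pad, flank)                    # e.g. "xxxxx"
  padded  <- paste0(padding, x, padding)           # always pad the full flank
  pattern <- sprintf(".{%d}X.{%d}", flank, flank)  # e.g. ".{5}X.{5}"
  # regmatches() silently drops elements without a match, so this
  # assumes every string contains an "X"
  regmatches(padded, regexpr(pattern, padded))
}

extract_window(c("KGDDQSXQGGAPDAGQE", "TEEDSEEVXEQK", "LTXTSGETTQTHTEPTGDSK"))
#> [1] "GDDQSXQGGAP" "DSEEVXEQKxx" "xxxLTXTSGET"
```

Padding by the full flank width on both sides is what guarantees a match: even when "X" is the first or last character, there are still 5 characters available around it. In the stringr version, swapping str_match() for str_extract() should likewise give a plain character vector.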