I would like to split a character by spaces but keep the spaces inside the quotes (and the quotes themselves). The problem is, the quotes can be nested, and also I would need to do this for both single and double quotes. So, from the line this "'"is a possible option"'" and ""so is this"" and '''this one too''' and even ""mismatched quotes"
I would like to get [this, "'"is a possible option"'", and, ""so is this"", and, '''this one too''', and, even, ""mismatched quotes"]
.
This question has already been asked, but not the exact question that I'm asking. Here are several solutions: one uses a matcher (in this case """x"""
would be split into [""", x"""]
, so this is not what I need) and Apache Commons (which works with """x"""
but not with ""x""
, since it takes the first two double quotes and leaves the last two with x
). There are also suggestions of writing a function to do so manually, but this would be the last resort.
You can achieve that with the following regex: ["']+[^"']+?["']+
. Using that pattern you retrieve the indices where you want to split like this:
val indices = Regex(pattern).findAll(this).map{ listOf(it.range.start, it.range.endInclusive) }.flatten().toMutableList()
The rest is building the list out of substrings. Here the complete function:
fun String.splitByPattern(pattern: String): List<String> {
val indices = Regex(pattern).findAll(this).map{ listOf(it.range.start, it.range.endInclusive) }.flatten().toMutableList()
var lastIndex = 0
return indices.mapIndexed { i, ele ->
val end = if(i % 2 == 0) ele else ele + 1 // magic
substring(lastIndex, end).apply {
lastIndex = end
}
}
}
Usage:
val str = """
this "'"is a possible option"'" and ""so is this"" and '''this one too''' and even ""mismatched quotes"
""".trim()
println(str.splitByPattern("""["']+[^"']+?["']+"""))
Output:
[this , "'"is a possible option"'", and , ""so is this"", and , '''this one too''', and even , ""mismatched quotes"]
Try it out on Kotlin's playground!