Splitting a string on whitespace in the D programming language, treating consecutive whitespace as a single delimiter


I want to split a string in the D programming language so that empty strings are not included in the result.

Example:

Input: This is   a string [note that there are 3 blank spaces between "is" and "a"]

Output: [This, is, a, string]

Problem

If I use the std.array.split function with " " (a blank space) as the delimiter, then I get: ["This", "is", "", "", "a", "string"]. See the empty elements between "is" and "a".

My Current solution

output = input.split(" ").filter!(l => !l.strip().empty).array;

Note that the same result could be obtained if multiple consecutive blank spaces were treated as one.

My question

Does the split function (or an alternative) have a built-in way to either:

  • automatically treat multiple consecutive occurrences of the delimiter as one, or
  • automatically reject elements that are whitespace only?

Either of these two would be sufficient for this particular example (I can't think of a counterexample).

I looked at Programming in D – Tutorial and Reference by Ali Çehreli, but I can't seem to find this functionality. Does that mean that in D you are supposed to use filter with a lambda?

Thank you for your help.


Solution

  • split called without a delimiter argument does exactly what you are asking for:

    When no delimiter is provided, strings are split into an array of words,
    using whitespace as delimiter. Runs of whitespace are merged together
    (no empty words are produced).
    

    (This is a quote from the documentation you linked to.)
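
    For illustration, here is a minimal sketch of that behaviour, assuming the std.array.split overload that takes no delimiter. The helper name splitWords is hypothetical and only wraps the call so it is easy to try out:

```d
import std.array : split;
import std.stdio : writeln;

// Hypothetical helper (not part of Phobos): splits on runs of whitespace.
string[] splitWords(string input)
{
    // No delimiter argument: whitespace is the delimiter, and runs of
    // whitespace are merged, so no empty words are produced.
    return input.split();
}

void main()
{
    // Three blank spaces between "is" and "a", as in the question.
    writeln(splitWords("This is   a string")); // prints ["This", "is", "a", "string"]
}
```

    Compared with input.split(" ") followed by filter, this needs no lambda, and it also treats tabs and newlines as delimiters, which may or may not be what you want.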