Best way to split string into lines with maximum length, without breaking words


I want to break a string up into lines of a specified maximum length, without splitting any words, if possible (if there is a word that exceeds the maximum line length, then it will have to be split).

As always, I am acutely aware that strings are immutable and that one should preferably use the StringBuilder class. I have seen examples where the string is split into words and the lines are then built up using the StringBuilder class, but the code below seems "neater" to me.

I said "best" rather than "most efficient" because I am also interested in the "elegance" of the code. The strings will never be huge, generally splitting into two or three lines, and this won't be run over thousands of lines.

Is the following code really bad?

private static IEnumerable<string> SplitToLines(string stringToSplit, int maximumLineLength)
{
    stringToSplit = stringToSplit.Trim();
    var lines = new List<string>();

    while (stringToSplit.Length > 0)
    {
        if (stringToSplit.Length <= maximumLineLength)
        {
            lines.Add(stringToSplit);
            break;
        }

        var indexOfLastSpaceInLine = stringToSplit.Substring(0, maximumLineLength).LastIndexOf(' ');
        lines.Add(stringToSplit.Substring(0, indexOfLastSpaceInLine >= 0 ? indexOfLastSpaceInLine : maximumLineLength).Trim());
        stringToSplit = stringToSplit.Substring(indexOfLastSpaceInLine >= 0 ? indexOfLastSpaceInLine + 1 : maximumLineLength);
    }

    return lines.ToArray();
}

Solution

  • How about this as a solution:

    IEnumerable<string> SplitToLines(string stringToSplit, int maximumLineLength)
    {
        var words = stringToSplit.Split(' ').Concat(new [] { "" });
        return
            words
                .Skip(1)
                .Aggregate(
                    words.Take(1).ToList(),
                    (a, w) =>
                    {
                        var last = a.Last();
                        while (last.Length > maximumLineLength)
                        {
                            a[a.Count() - 1] = last.Substring(0, maximumLineLength);
                            last = last.Substring(maximumLineLength);
                            a.Add(last);
                        }
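                        // Skip the join for empty words (the "" sentinel or repeated spaces)
                        // so that no trailing space or extra empty line is produced.
                        var test = w.Length == 0 ? last : last + " " + w;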
                        if (test.Length > maximumLineLength)
                        {
                            a.Add(w);
                        }
                        else
                        {
                            a[a.Count() - 1] = test;
                        }
                        return a;
                    });
    }
    

    I reworked this, as I prefer this version:

    IEnumerable<string> SplitToLines(string stringToSplit, int maximumLineLength)
    {
        var words = stringToSplit.Split(' ');
        var line = words.First();
        foreach (var word in words.Skip(1))
        {
            var test = $"{line} {word}";
            if (test.Length > maximumLineLength)
            {
                yield return line;
                line = word;
            }
            else
            {
                line = test;
            }
        }
        yield return line;
    }
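
The iterator version above never splits a word that is longer than `maximumLineLength`, which the question asks for. Here is a minimal sketch of one way to add that, assuming C# 6 or later. The `LineSplitter` class name, the argument check, and the choice to collapse repeated spaces are assumptions of this sketch, not part of the original answer:

```csharp
using System;
using System.Collections.Generic;

public static class LineSplitter
{
    // Hypothetical variant of SplitToLines: wraps on spaces and hard-splits
    // any single word that is longer than maximumLineLength.
    public static IEnumerable<string> SplitToLines(string stringToSplit, int maximumLineLength)
    {
        // Because this is an iterator, the check runs on first enumeration.
        if (maximumLineLength < 1)
            throw new ArgumentOutOfRangeException(nameof(maximumLineLength));

        var line = "";
        // RemoveEmptyEntries drops the empty "words" produced by repeated spaces.
        foreach (var rawWord in stringToSplit.Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries))
        {
            var word = rawWord;

            // A word that cannot fit on any line is cut into full-width chunks.
            while (word.Length > maximumLineLength)
            {
                if (line.Length > 0)
                {
                    yield return line;   // flush the pending line first
                    line = "";
                }
                yield return word.Substring(0, maximumLineLength);
                word = word.Substring(maximumLineLength);
            }

            if (line.Length == 0)
            {
                line = word;
            }
            else if (line.Length + 1 + word.Length > maximumLineLength)
            {
                yield return line;       // the word does not fit; start a new line
                line = word;
            }
            else
            {
                line = line + " " + word;
            }
        }

        if (line.Length > 0)
            yield return line;
    }
}
```

For example, `LineSplitter.SplitToLines("The quick brown fox jumps", 10)` yields `The quick`, `brown fox` and `jumps`, while `LineSplitter.SplitToLines("ab abcdefghij", 5)` yields `ab`, `abcde` and `fghij`. An over-long word always starts on a fresh line here. Packing its first chunk onto the current line would be a different design choice.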