c# string linq ienumerable

Split a string by length, breaking only at the nearest space


I have text like this:

var data = "âô¢¬ôè÷¢ : ªîø¢è¤ô¢ - ã¿ñ¬ô ñèù¢ ªð¼ñ£÷¢ ï¤ôñ¢,«ñø¢è¤ô¢ - ªð¼ñ£÷¢ ñèù¢ ÝÁºèñ¢ ï¤ô袰ñ¢ ñ¤ì¢ì£ Üò¢òñ¢ ªð¼ñ£ñ¢ð좮 è¤ó£ñ âô¢¬ô袰ñ¢,õìè¢è¤ô¢ - ÝÁºèñ¢ ï¤ôñ¢,è¤öè¢è¤ô¢ - ô좲ñ¤ ï¤ôñ¢ ñø¢Áñ¢ 1,22 ªê ï¤ôñ¢ ð£î¢î¤òñ¢";

and I have these extension methods to split a string:

public static IEnumerable<string> EnumByLength(this string s, int length)
{
    for (int i = 0; i < s.Length; i += length)
    {
        if (i + length <= s.Length)
        {
            yield return s.Substring(i, length);
        }
        else
        {
            yield return s.Substring(i);
        }
    }
}
public static string[] SplitByLength(this string s, int maxLen)
{
    // An iterator method never returns null, so no null check on its result is needed.
    return s.EnumByLength(maxLen).ToArray();
}

Now my question is:

How can I split this string into chunks of about 150 characters, breaking only at the space nearest to the 150-character mark (either just before or just after it) and never in the middle of a word?


Solution

  • My version:

    // Enumerates chunks of the string, splitting at the space nearest to the requested length
    // (a chunk may end up shorter or longer than length, but words are never cut).
    // E.g. for length = 3:
    // "abcd efghihjkl m n p qrstsf" -> "abcd", "efghihjkl", "m n", "p", "qrstsf" 
    public static IEnumerable<String> EnumByNearestSpace(this String value, int length) {
      if (String.IsNullOrEmpty(value))
        yield break;
    
      int bestDelta = int.MaxValue;
      int bestSplit = -1;
    
      int from = 0;
    
      for (int i = 0; i < value.Length; ++i) {
        var Ch = value[i];
    
        if (Ch != ' ')
          continue;
    
        int size = (i - from);
        int delta = (size - length > 0) ? size - length : length - size;
    
        if ((bestSplit < 0) || (delta < bestDelta)) {
          bestSplit = i;
          bestDelta = delta;
        }
        else {
          yield return value.Substring(from, bestSplit - from);
    
          i = bestSplit;
    
          from = i + 1;
          bestSplit = -1;
          bestDelta = int.MaxValue;
        }
      }
    
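      // String's tail: split at the best remaining space only if that is
      // closer to the requested length than emitting the whole tail
      if (from < value.Length) {
        int tailSize = value.Length - from;
        int tailDelta = (tailSize - length > 0) ? tailSize - length : length - tailSize;
    
        if ((bestSplit >= 0) && (bestDelta < tailDelta)) {
          yield return value.Substring(from, bestSplit - from);
    
          from = bestSplit + 1;
        }
    
        if (from < value.Length)
          yield return value.Substring(from);
      }
    }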
    
    ...
    
    var list = data.EnumByNearestSpace(150).ToList();
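
Note that splitting at the nearest space can produce chunks slightly longer than the requested length. If 150 has to be a hard maximum, a greedy variant that only breaks at the last space at or before the limit may fit better. The following is a minimal sketch under that assumption; the class name StringWrapExtensions and the method name SplitAtSpaceBefore are hypothetical and not part of the question or the answer above. A single word longer than the limit is still emitted whole.

```csharp
using System;
using System.Collections.Generic;

public static class StringWrapExtensions
{
    // Hypothetical helper: greedy split that breaks at the last space at or
    // before maxLen, so no chunk exceeds maxLen unless a single word does.
    public static IEnumerable<string> SplitAtSpaceBefore(this string value, int maxLen)
    {
        if (maxLen <= 0)
            throw new ArgumentOutOfRangeException(nameof(maxLen));
        if (string.IsNullOrEmpty(value))
            yield break;

        int from = 0;

        // Keep splitting while the remainder is longer than maxLen.
        while (value.Length - from > maxLen)
        {
            // Search backwards over indices from + 1 .. from + maxLen for a space.
            int split = value.LastIndexOf(' ', from + maxLen, maxLen);
            if (split < 0)
            {
                // Over-long word: fall back to the first space after the limit.
                split = value.IndexOf(' ', from + maxLen);
                if (split < 0)
                    break; // no spaces left; the rest is emitted below
            }

            yield return value.Substring(from, split - from);
            from = split + 1; // skip the space itself
        }

        if (from < value.Length)
            yield return value.Substring(from);
    }
}
```

With this sketch, "abcd efghihjkl m n p qrstsf".SplitAtSpaceBefore(10) yields "abcd", "efghihjkl", "m n p", "qrstsf". For the text in the question, the call would be data.SplitAtSpaceBefore(150).ToList() (ToList needs using System.Linq).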