c# arrays list duplicates contiguous

Replace contiguous duplicates with a set value, but only if the contiguous run reaches a threshold length


What would be an efficient way to replace contiguous identical values in a list with another given value, but only if the contiguous sequence runs for at least a certain number of elements (e.g. 5 or more)?

Example:

["red"; "red"; "blue"; "green"; "green"; "red"; "red" ; "red"; "red"; "red"; "red"; "yellow"; "white"; "white"; "red"; "white"; "white"]

should become:

["red"; "red"; "blue"; "green"; "green"; "ignore"; "ignore" ; "ignore"; "ignore"; "ignore"; "ignore"; "yellow"; "white"; "white"; "red"; "white"; "white"]

Any idea?


Solution

  • As mentioned in the comments, one option is to group contiguous duplicates with GroupAdjacent from the MoreLinq NuGet package:

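    using System;
    using System.Collections.Generic;
    using System.Linq;
    using MoreLinq; // provides GroupAdjacent

    var strings = new List<string> { "red", "red", "blue", "green", "green", "red", "red", "red", "red", "red", "red", "yellow", "white", "white", "red", "white", "white" };

    var result = strings
        .GroupAdjacent(x => x)                 // one group per run of equal neighbours
        .SelectMany(grp => (grp.Count() >= 5) ?
                    grp.Select(x => "ignore") : // run of 5 or more: replace every element
                    grp);                       // shorter run: keep as is

    Console.WriteLine("{ " + string.Join(", ", result) + " }");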
    

    Result:

    { red, red, blue, green, green, ignore, ignore, ignore, ignore, ignore, ignore, yellow, white, white, red, white, white }
    

    The above also uses Enumerable.SelectMany to flatten the grouped IEnumerable<IEnumerable<string>> sequence into an IEnumerable<string>. For each group, the ternary operator decides whether to replace every element with "ignore" via Enumerable.Select (when the group length from Enumerable.Count is greater than or equal to 5) or to leave the group as is.
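
If adding a NuGet dependency is not desirable, the same grouping idea can be sketched with a plain loop that scans each run once. This is only an illustration, not part of the original answer: the class name RunReplacer and the method ReplaceLongRuns are made up for this example, and it assumes the default equality comparer is the right notion of "identical".

```csharp
using System;
using System.Collections.Generic;

public static class RunReplacer
{
    // Hypothetical helper (not from the original answer): replaces every run of
    // at least 'threshold' equal neighbours with 'replacement'.
    public static List<T> ReplaceLongRuns<T>(IReadOnlyList<T> source, T replacement, int threshold)
    {
        var comparer = EqualityComparer<T>.Default; // assumption: default equality is wanted
        var result = new List<T>(source.Count);
        int start = 0;

        while (start < source.Count)
        {
            // Find the exclusive end of the run that begins at 'start'.
            int end = start + 1;
            while (end < source.Count && comparer.Equals(source[end], source[start]))
                end++;

            // Copy the run, substituting the replacement if it is long enough.
            bool replace = end - start >= threshold;
            for (int i = start; i < end; i++)
                result.Add(replace ? replacement : source[i]);

            start = end;
        }

        return result;
    }

    public static void Main()
    {
        var strings = new List<string> { "red", "red", "blue", "green", "green", "red", "red", "red", "red", "red", "red", "yellow", "white", "white", "red", "white", "white" };
        var result = ReplaceLongRuns(strings, "ignore", 5);
        Console.WriteLine("{ " + string.Join(", ", result) + " }");
    }
}
```

Each element is visited once when measuring its run and once when copying it, so the work grows linearly with the list length. The result is a new list, and the input is left unmodified.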