LINQ Query projection from multiple from statements


I'm creating a very large List of objects that holds every combination of the values generated from the Minimum, Maximum and Increment of several item collections.

It works; however, I can't easily exclude one or more item collections without rewriting the LINQ query. The order of the 'from' clauses follows the order of the passed-in 'loops' list. In this example, the query and its final projection into the StepModel list expect exactly three item sets. Although the 'from' clauses are LINQ, they still look like repetitive code that would lend itself to a for loop, but I can't figure out how to do that given that the projection is part of the same query. NOTE: I may add more collections, compounding the problem further!

public static IEnumerable<StepModel> CreateSteps(List<MinToMax> loops)
{
    var steps =
        from item1 in Enumerable
            .Repeat(loops[0].Minimum, (loops[0].Maximum - loops[0].Minimum) / loops[0].Increment + 1)
            .Select((tr, ti) => tr + loops[0].Increment * ti)

        from item2 in Enumerable
            .Repeat(loops[1].Minimum, (loops[1].Maximum - loops[1].Minimum) / loops[1].Increment + 1)
            .Select((tr, ti) => tr + loops[1].Increment * ti)

        from item3 in Enumerable
            .Repeat(loops[2].Minimum, (loops[2].Maximum - loops[2].Minimum) / loops[2].Increment + 1)
            .Select((tr, ti) => tr + loops[2].Increment * ti)

        select new StepModel
        {
            ItemValues1 = new Step { Value = item1, IsActive = true },
            ItemValues2 = new Step { Value = item2, IsActive = true },
            ItemValues3 = new Step { Value = item3, IsActive = true },
        };

    return steps;
}

public class MinToMax
{
   public int Minimum { get; set; }
   public int Maximum { get; set; }
   public int Increment { get; set; }
   public bool IsActive { get; set; } = true;
}

public class Step
{
   public int Value { get; set; }
   public bool IsActive { get; set; } = true;
}

public class StepModel
{
   public Step ItemValues1 { get; set; }
   public Step ItemValues2 { get; set; }
   public Step ItemValues3 { get; set; }
}

public class ItemSteps
{
    public MinToMax ItemStep1 { get; } = new MinToMax();
    public MinToMax ItemStep2 { get; } = new MinToMax();
    public MinToMax ItemStep3 { get; } = new MinToMax();
}

public static List<MinToMax> GetValuesIntoSteps()
{
    var list = new List<MinToMax>();
    var itemValues = new ItemSteps();

    itemValues.ItemStep1.Minimum = 10;
    itemValues.ItemStep1.Maximum = 100;
    itemValues.ItemStep1.Increment = 10;

    if (itemValues.ItemStep1.IsActive)
    {
        list.Add(itemValues.ItemStep1);
    }

    itemValues.ItemStep2.Minimum = 3;
    itemValues.ItemStep2.Maximum = 30;
    itemValues.ItemStep2.Increment = 3;

    if (itemValues.ItemStep2.IsActive)
    {
        list.Add(itemValues.ItemStep2);
    }

    itemValues.ItemStep3.Minimum = 15;
    itemValues.ItemStep3.Maximum = 75;
    itemValues.ItemStep3.Increment = 5;

    if (itemValues.ItemStep3.IsActive)
    {
        list.Add(itemValues.ItemStep3);
    }

    return list;
}

Solution

  • What you're trying to do here can be described at a high level as cross joins of an arbitrary number of sequences.

    When you use multiple from clauses in a LINQ query like that, you get a cross join: the compiler translates each from clause after the first into a call to SelectMany. To support an arbitrary number of inputs, you'll need to loop through them. A simple foreach loop won't quite work, though, because the first sequence needs to be handled a bit differently.
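
    To make the translation concrete, here is a small sketch with two sequences. The class and method names (CrossJoinDemo, QuerySyntax, MethodSyntax) are made up for illustration; only SelectMany itself comes from LINQ. Both methods should produce the same pairs in the same order.

    ```csharp
    using System;
    using System.Collections.Generic;
    using System.Linq;

    // Illustrative only: these names are not part of the question's code.
    public static class CrossJoinDemo
    {
        // Query syntax: two from clauses form a cross join.
        public static List<(int, char)> QuerySyntax(int[] xs, char[] ys) =>
            (from x in xs
             from y in ys
             select (x, y)).ToList();

        // Method syntax: the second from clause becomes one SelectMany call.
        // The first lambda picks the inner sequence, the second builds the result.
        public static List<(int, char)> MethodSyntax(int[] xs, char[] ys) =>
            xs.SelectMany(x => ys, (x, y) => (x, y)).ToList();
    }
    ```

    Calling either method with { 1, 2 } and { 'a', 'b' } yields four pairs, one for every combination of an element from each input.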

    Differently how? Two things. First, there's nothing yet on which to call SelectMany, and calling it on an empty sequence just yields another empty sequence, so the first element in loops has to establish the base sequence. The implementation I've provided below does start with an empty sequence, but only as the return value for the case where loops has no elements; it gets replaced as soon as the first element is found. Second, at least in this implementation, the first element has to produce a new single-item list for each of its values, whereas each subsequent join copies the existing list and appends the next value.

    The other thing to keep in mind is that you need a result type that can handle an arbitrary number of elements. I've gone with IEnumerable<List<Step>> for this. I chose List<Step> as the element type because each index of each list corresponds to the same index of the loops parameter, so you can index them both directly. For example, element [5] of each result will have been sourced from loops[5].

    To make that happen, I've written the equivalent of a foreach loop using IEnumerator<T> directly. For the first element in loops, it constructs a list of Step objects for each result you want to return.

    For each subsequent item in loops, it does a cross join using SelectMany and aggregates them into new lists containing the elements from the left side and each element from the right side.

    The enumerator exposes a Current property. This frees the iterations from being bound to a given index. You'll notice the current variable being used where loops[n] used to be.

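    It's worth noting that the calls to ToList are necessary to force evaluation. The current variable is declared once, outside the while loop, so every lambda captures that same variable; without ToList, the deferred queries would all run later against whatever current was last assigned. The iteration variable of a foreach loop behaves differently, because each iteration gets its own variable (since C# 5).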

    Here's how that looks:

    public static IEnumerable<List<Step>> CreateSteps(IEnumerable<MinToMax> loops)
    {
        IEnumerable<List<Step>> sequence = Enumerable.Empty<List<Step>>();
    
        using (IEnumerator<MinToMax> enumerator = loops.GetEnumerator())
        {
            if (enumerator.MoveNext())
            {
                MinToMax current = enumerator.Current;
    
                sequence = Enumerable
                    .Repeat(current.Minimum, (current.Maximum - current.Minimum) / current.Increment + 1)
                    .Select((tr, ti) => new List<Step>() { new Step() { Value = tr + current.Increment * ti, IsActive = true } })
                    .ToList();
    
                while (enumerator.MoveNext())
                {
                    current = enumerator.Current;
    
                    sequence = sequence
                        .SelectMany(
                            ctx => Enumerable
                                .Repeat(current.Minimum, (current.Maximum - current.Minimum) / current.Increment + 1)
                                .Select((tr, ti) => new Step() { Value = tr + current.Increment * ti, IsActive = true }),
                            (list, value) => new List<Step>(list) { value }
                            )
                        .ToList();
                }
            }
        }
    
        return sequence;
    }
    

    I also changed the loops parameter to IEnumerable<MinToMax> because nothing about how the method uses it requires it to be a list. On the other hand, I left the items in the returned sequence as Lists because it'll make it easier to correlate their elements to the source list, as I mentioned earlier.
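
    On the ToList point above: if materializing every intermediate list is a concern for a very large result, one possible variant is sketched below. It is not the method shown above. It assumes a hypothetical name, CreateStepsLazy, swaps Enumerable.Repeat for Enumerable.Range, and seeds the sequence with a single empty list so that every element of loops, including the first, can go through the same SelectMany call. It relies on the foreach iteration variable being fresh per iteration (C# 5 and later), and it assumes Increment is never zero.

    ```csharp
    using System;
    using System.Collections.Generic;
    using System.Linq;

    public class MinToMax
    {
        public int Minimum { get; set; }
        public int Maximum { get; set; }
        public int Increment { get; set; }
    }

    public class Step
    {
        public int Value { get; set; }
        public bool IsActive { get; set; } = true;
    }

    public static class LazySteps
    {
        // Hypothetical alternative; the name is an assumption for this sketch.
        public static IEnumerable<List<Step>> CreateStepsLazy(IEnumerable<MinToMax> loops)
        {
            // One empty list is the starting point that each range extends.
            IEnumerable<List<Step>> sequence = new[] { new List<Step>() };
            bool any = false;

            foreach (MinToMax loop in loops)
            {
                any = true;

                // 'loop' is a new variable on every iteration, so each deferred
                // lambda below keeps seeing its own MinToMax instance.
                sequence = sequence.SelectMany(
                    prefix => Enumerable
                        .Range(0, (loop.Maximum - loop.Minimum) / loop.Increment + 1)
                        .Select(i => new Step { Value = loop.Minimum + loop.Increment * i, IsActive = true }),
                    (prefix, step) => new List<Step>(prefix) { step });
            }

            // Match the original behavior: no inputs means no results.
            return any ? sequence : Enumerable.Empty<List<Step>>();
        }
    }
    ```

    Nothing is computed until the caller enumerates the result, so combinations are produced one at a time. The trade-off is that enumerating the result twice repeats all of the work.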

    There's more refactoring that could be done, like extracting the expression passed to Enumerable.Repeat to a method so it can be reused.
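
    As a minimal sketch of that refactoring, the range expansion could move into an extension method. The names MinToMaxExtensions and ToSteps are hypothetical, not part of the original code, and the sketch assumes Increment is never zero.

    ```csharp
    using System;
    using System.Collections.Generic;
    using System.Linq;

    public class MinToMax
    {
        public int Minimum { get; set; }
        public int Maximum { get; set; }
        public int Increment { get; set; }
    }

    public class Step
    {
        public int Value { get; set; }
        public bool IsActive { get; set; } = true;
    }

    public static class MinToMaxExtensions
    {
        // Hypothetical helper: expands one MinToMax into its sequence of Step values.
        // Assumes Increment is non-zero; a zero Increment would throw on the division.
        public static IEnumerable<Step> ToSteps(this MinToMax range)
        {
            int count = (range.Maximum - range.Minimum) / range.Increment + 1;

            return Enumerable
                .Repeat(range.Minimum, count)
                .Select((start, index) => new Step { Value = start + range.Increment * index, IsActive = true });
        }
    }
    ```

    With a helper like this, the first branch of CreateSteps could become current.ToSteps().Select(step => new List<Step> { step }), and the collection selector passed to SelectMany could become ctx => current.ToSteps().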

    I didn't see how you were using the results, so I put together a quick test harness. It produced the same results for both implementations with the three elements in loops you provided, and my version also appeared to produce correct results with two and with four inputs.

    static void Main(string[] args)
    {
        List<MinToMax> loops = GetValuesIntoSteps();
    
        foreach (List<Step> loop in CreateSteps(loops))
        {
            foreach (Step step in loop)
            {
                Console.Write($"{step.Value} ");
            }
            Console.WriteLine();
        }
    }
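
    One extra sanity check, sketched here under the assumption of a hypothetical helper named StepCount.Expected, is to compare the number of lists returned against the product of the per-range counts. For the three sample ranges that product is 10 × 10 × 13 = 1300.

    ```csharp
    using System;
    using System.Collections.Generic;
    using System.Linq;

    public class MinToMax
    {
        public int Minimum { get; set; }
        public int Maximum { get; set; }
        public int Increment { get; set; }
    }

    public static class StepCount
    {
        // Hypothetical helper: how many combinations a cross join of the ranges yields.
        // Note: for an empty input this returns 1, whereas CreateSteps returns no results.
        public static long Expected(IEnumerable<MinToMax> loops) =>
            loops.Aggregate(1L, (total, loop) =>
                total * ((loop.Maximum - loop.Minimum) / loop.Increment + 1));
    }
    ```

    In Main, comparing CreateSteps(loops).Count() with StepCount.Expected(loops) would then flag a missing or duplicated combination without reading through the printed output.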
    

    I hope that helps.