Search code examples
c#ssisscript-task

SSIS Script Task - Count Number of Unique Rows in Output Object


My SSIS script task generates 5 rows of records into an Output script component.

if (conditionMeets)
{
    WOProductBuffer.AddRow();
    WOProductBuffer.WorkOrderId = workOrderId;
    WOProductBuffer.WorkOrderProductId = workOrderProductId;
    //other fields
}

enter image description here

Objective: Count number of rows, group by WorkOrderId and WorkOrderProductId, and set this count value to WopCount

I realized that PostExecute() method is unable to read the Output object WOProduct, so it is likely not possible.

Based on all the rows, is there a way to implement this within the same script task?

Or the only way is to create a new script task, loop all the records in PreExecute() method to generate the count value?


What I have tried:

  1. Adding WOProductBuffer to a list, and loop in PostExecute() method. This didn't work as each row doesn't seem to contain any value

Currently trying:

  1. How to loop through Input rows in a new script task

Solution

  • From an earlier question, you have a class something like this

    public class WorkOrderProduct
    {
        public Guid workOrderId;
        public Guid workOrderProductId;
    }
    

    In a script task, you'll want to update an SSIS Variable with the final count so we know that can only take place in the PostExecute method.

    Similar to how we moved the declaration of wopList to the class level on How to Access Parameters of User Variables in SSIS Script Task we would create a similar list, just with different a type.

    There are two ways of doing this. You can either implement the distinct logic in your code and only add to the list unique items. The other option is to use a tiny bit of LINQ and let it do the logic for you.

    The decision points become:

    • What do I understand and want to maintain?
    • What's the expected cardinality relative to the size of the list - aka how many total rows would we expect versus how many uniques? If it's under millions, eh, I can't imagine it making a difference which you choose. Above a million, I'll start pulling out my consulting "It Depends" card but I suspect you'll be fine. Billions? yeah, I bet things will start to get interesting. If nothing else, you'll probably need LongCount instead of the Count method.

    Create a class member - pick one (or both and try them out)

    List<KeyValuePair<Guid, Guid>> option1;
    List<KeyValuePair<Guid, Guid>> option2;
    

    In your PreExecute method, instantiate the List(s)

    this.option1 = new List<KeyValuePair<Guid, Guid>>();
    this.option2 = new List<KeyValuePair<Guid, Guid>>();
    

    In your existing logic, as a final step, implement option 1 or 2. We will create a KeyValuePair of our two Guids.

    We will then ask the existing List if it has one of those already? If it does not, we'll add it to our option1 list. Finally, we'll just add it to option2 list as we'll figure out uniques later.

    if (conditionMeets)
    {
        // Doing our business process here thing
    
        KeyValuePair<Guid, Guid> newItem = new KeyValuePair<Guid, Guid>(workOrderId, workOrderProductId);
        
        if (!option1.Contains(newItem))
        {
            option1.Add(newItem);
        }
    
        // Just add it and we'll figure it out later
        option2.Add(newItem);
    }
    

    In your PostExecute method, you can use the Count property on the option1 List as you've already done the heavy lifting to only add distinct values.

    For option2, we'll invoke the Distinct Method and then chain a call to Count() method. Do note the difference in when we use parentheses here otherwise you'll have "Cannot convert from 'method group' to int" syntax error in your code.

    Finally, Console.WriteLine doesn't do you any good. Instead you'll assign values back to your Variable or as I showed in

    bool pbFireAgain = false;
    int uniqueCount = 0;
    
    // Option 1 approach
    uniqueCount = option1.Count;
    // Pop the value into the run log so we can trace what was generated
    this.ComponentMetaData.FireInformation(0, "SCR PostExecute Counts", string.Format("option1 count is {0}", option1.Count), "", 0, ref pbFireAgain);
    
    // Option 2 logic
    uniqueCount = option2.Distinct().Count();
    // Push the option 2 values into the output log
    this.ComponentMetaData.FireInformation(0, "SCR PostExecute Counts", string.Format("option2 Distinct Count is {0}, total Count is {1}", option2.Distinct().Count(), option2.Count()), "", 0, ref pbFireAgain);
    
    
    this.Variables.MySSISVariable = uniqueCount;