Search code examples
ssisderived

FINDSTRING multiple criteria for derived column in SSIS


I am new to SSIS and I am having an issue populating a derived column based on more than one criteria from strings in a column.

I have managed to get it working with a single criteria as an example:

FINDSTRING(OS,"Server",1) > 0 ? "Server" : "Desktop"

The above works and populates anything with server in the OS to "server" and everything else to "Desktop" but I have other strings that can identify a server. what I have tried as an example is:

FINDSTRING(OS,"Server", "Red Hat", "AIX",1) > 0 ? "Server" : "Desktop"

I basically have about 10 key words that id a server in the OS column so I want to output Server for these in the derived column and Desktop for anything without those strings.

Is that possible? I thought about doing 10 different find strings but I assumed the outputs would overwrite each other.

Thank you.


Solution

  • That's not valid syntax so no, what you're directly attempting won't work.

    If you have a constrained list of values aka a static list for your server values then I would take the approach of adding 10 Derived Columns to your data flow.

    They will take the following pattern of adding a new boolean (true/false) column to the data flow. You do not need to bring those values into your final table but you will use them to compute the final value of whether this is a desktop or server operating system.

    DER HPUX

    Add a derived column to the data flow and name it DER HPUX where the final word(s) are the server key word. You'll then add a new column to the using the FINDSTRING syntax and name it along the lines of your component name

    hasHPUX  FINDSTRING(LOWER([OS]), "hpux") > 0
    

    Note that I have explicitly cast the OS to lower case here and do the same with my argument to findstring as I don't know for certain whether Red Hat will always be Red Hat and not red hat or RED HAT. You know your data better but that may help if the data is inconsistently formatted.

    Repeat this pattern for all your keywords

    DER IsServer

    Here I'll create another intermediate column called IsServer and all I'm going to do is OR all the preceding has* named

    hasHPUX || hasRedHat || hasAix || etc
    

    DER Determine Server or Desktop

    Finally, we're ready to use the newly created column of isServer to populate the column OSClass or whatever you want it called

    (isServer) ? "Server" : "Desktop"
    

    Wow, that's a lot of work, why in the world would you do all of that?

    Testing.

    You can't debug an expression. By breaking all of the complex logic down into tiny nibbles, you can put a Data Viewer in the package after the DER isServer and at a glance determine why something is or isn't setting the appropriate flag value.

    Sure, you could do this in a single expression like the syntax you are trying but help your future self by not doing that. When your expression is so long you can't see all of it in the editor, it's time to break it into smaller units.

    (FINDSTRING(LOWER([OS]), "hpux") > 0 || FINDSTRING(LOWER([OS]), "red hat") > 0 || FINDSTRING(LOWER([OS]), "aix")> 0 || etc) ? "Server" : "Desktop"
    

    References