Search code examples
sql-servervb.netssisetlssis-2012

Parse one field into several, based on delimiter


I am trying to figure out a way to parse 1 field into 8 different fields using SSIS. I'm open to a C# solution or a VB solution. I came up with a way to do it in SQL Server, but because my team is doing a lot of things in SSIS now, I'd like to have a SSIS solution to keep things streamlined and efficient. I looked online and found one option, in the link below.

Split a single column of data with comma delimiters into multiple columns in SSIS

I'm not really sure how that works, if it even works. My sample data looks like this:

Purchase | 345 | USD | GT | TF456577 | DG | 125 | KTMDC

I can have 0 to 7 pipe characters, so I need to split one field into a max of eight fields.

enter image description here

Update

Hadi, I am making a couple small edits. Can you look at the screenshot and tell me what's wrong here? Also, I never found the DT-STR option. Maybe that's the problem.


Solution

  • You can do this using a script component

    In this answer, I will assume that the data source is an SQL Server Table, with one column Column0

    In the Integration Service Package, follow these steps:

    1. Add a Data Flow Task

    2. Create an OLEDB Connection manager, and configure it

    3. In the Data Flow Task add an OLEDB Source and choose the Table that contains the Column you need to split

    4. Add a Script Component (Set its type to Transformation)

    5. Connect the OLEDB Source to the script component

    6. Change that Script Language to Visual Basic

    7. In the Script Component Select Column0 as Input

    8. Go To the Inputs Outputs Tab

    9. Add 8 Columns to Output0 (ex: outColumn0, outColumn1 ... outColumn8) with DT-STR data type

    10. In the Script window write the following script:

      Imports System.Linq
      
      Public Overrides Sub Input0_ProcessInputRow(ByVal Row As Input0Buffer)
      
          If Not Row.Column0_IsNull AndAlso
              Not String.IsNullOrEmpty(Row.Column0.Trim) Then
      
              Dim strColumn As String = Row.Column0
              Dim intVerticalBarCount As Integer
      
              intVerticalBarCount = strColumn.Count(Function(x) x = CChar("|"))
      
              If intVerticalBarCount <= 8 Then strColumn = strColumn & "".PadRight((8 - intVerticalBarCount), CChar("|"))
      
              Dim strOutputColumns As String() = strColumn.Split(CChar("|"))
      
              Row.outColumn0 = strOutputColumns(0)
              Row.outColumn1 = strOutputColumns(1)
              Row.outColumn2 = strOutputColumns(2)
              Row.outColumn3 = strOutputColumns(3)
              Row.outColumn4 = strOutputColumns(4)
              Row.outColumn5 = strOutputColumns(5)
              Row.outColumn6 = strOutputColumns(6)
              Row.outColumn7 = strOutputColumns(7)
      
      
          Else
      
              Row.outColumn0_IsNull = True
              Row.outColumn1_IsNull = True
              Row.outColumn2_IsNull = True
              Row.outColumn3_IsNull = True
              Row.outColumn4_IsNull = True
              Row.outColumn5_IsNull = True
              Row.outColumn6_IsNull = True
              Row.outColumn7_IsNull = True
      
      
      
          End If
      
      
      End Sub
      

    In the code above, first we get the number of occurrence of vertical bars | in the column; if it is less than 8, we add the missing vertical bars, then we split the column and assign it to the output columns.