I am working with SSIS and I need to load multiple files with the following (Yellos) format to SQL using SSIS
The problem as you can see is that the files has an horrible format only process / consume records if the column A is populated (e.g: ignoring rows# 14 - X ) and I need to insert the value in D1 into the Date column.
any suggestion?
Regards!
Lets divide this problem into 3 Sub problems:
D1
1. Get the date value from D1
@[User::FilePath]
(of type string) that contains the excel file path, @[User::FileDate]
(of type string) that we will use it to store the date value@[User::FilePath]
as a ReadOnly variable and @[User::FileDate]
as a ReadWrite variable@[User::FileDate]
This will search for the sheet named Refunds
and extract the date value from it and store this value into @[User::FileDate]
m_strExcelPath = Dts.Variables.Item("FilePath").Value.ToString
Dim strSheetname As String = String.Empty
Dim strDate as String = String.Empty
m_strExcelConnectionString = Me.BuildConnectionString()
Try
Using OleDBCon As New OleDbConnection(m_strExcelConnectionString)
If OleDBCon.State <> ConnectionState.Open Then
OleDBCon.Open()
End If
'Get all WorkSheets
m_dtschemaTable = OleDBCon.GetOleDbSchemaTable(OleDbSchemaGuid.Tables,
New Object() {Nothing, Nothing, Nothing, "TABLE"})
'Loop over work sheet to get the first one (the excel may contains temporary sheets or deleted ones
For Each schRow As DataRow In m_dtschemaTable.Rows
strSheetname = schRow("TABLE_NAME").ToString
If Not strSheetname.EndsWith("_") AndAlso strSheetname.EndsWith("$") Then
If Not strSheetname.Tolower.Contains("refunds") Then Continue For
Using cmd As New OleDbCommand("SELECT * FROM [" & strSheetname & "A1:D1]", OleDBCon)
Dim dtTable As New DataTable("Table1")
cmd.CommandType = CommandType.Text
Using daGetDataFromSheet As New OleDbDataAdapter(cmd)
daGetDataFromSheet.Fill(dtTable)
'Get Value from column 4 (3 because it is a zero-based index
strDate = dtTable.Rows(0).Item(3).ToString
End Using
End Using
'when the first correct sheet is found there is no need to check others
Exit For
End If
Next
OleDBCon.Close()
End Using
Catch ex As Exception
Throw New Exception(ex.Message, ex)
End Tr
Dts.Variables.Item("FileDate").Value = strDate
Dts.TaskResult = ScriptResults.Success
End Sub
In the DataFlow Task add a Derived Column Transformation
, add a derived column with the following expression
@[User::FileDate]
2. Start Reading from Row Number 4
As we assumed that the Excel File Path is stored in @[User::FilePath]
Excel Connection Manager
and uncheck the box First row has column names
SQL Command
SELECT * FROM [Refunds$A4:D]
, so it will start reading from the row number 43. Ignore all Rows Where Column1 is NULL
Split the Flow based on the following expression
ISNULL([F1]) == False
If you didn't give an alias to F1
otherwise use the alias
Finally, remember that you must add a derived column (as we said in the first sub-problem) that contains the date value