I need to stage a couple of thousand txt files into a SQL Server 2012 instance using SSIS.
The issue is that some of the txt files have control checksums in their first two lines and some others don't.
If a file has a checksum, its first line starts with "HEADER"; in that case I have to skip the first two lines, read the column headers from line #3, and load data starting from line #4.
If a file does not have a checksum, the column names are in line #1 and data starts at line #2.
Now, my question is: what is the least invasive way of dealing with this scenario? I am lazy, so I am looking for minimum effort with maximum effect.
My ideas so far:
Detect the existence of checksums with a C# scripting component and build two separate flows, one for each file type. Disadvantage: two almost identical flows (I am not a big fan of redundancy).
Use PowerShell first to strip the checksums from the files before running the SSIS flow. Disadvantage: plenty of files need to be rewritten (performance hit).
Ask on Stack Overflow. Disadvantage: may be seen as ironic by the Stack Overflow community.
Any hints?
Here's an approach.
Use two Flat File Connection Managers. For the one that reads the files without checksum rows, set Header Rows to Skip to 0, and for the Flat File Connection Manager that reads the files with checksum rows, set it to skip the first 2 rows. Set the DelayValidation property of both connection managers to true.
Detect whether the current file has checksum rows and set a Boolean HasCheckSumRows flag accordingly. With the HasCheckSumRows flag set to true, create 2 outputs, 1 for Files With Checksum Rows, the other for Files Without.
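For the detection step, here is a minimal sketch of what the Script Task body could look like, assuming the loop over the files puts the current path in a variable named User::CurrentFileName and that a Boolean User::HasCheckSumRows is on the task's read-write variable list (both names are placeholders; adjust them to your package):

    // Goes inside the ScriptMain class generated by an SSIS Script Task (C#).
    // Reads only the first line of the current file and flags whether it is
    // one of the files that carries checksum rows.
    public void Main()
    {
        // User::CurrentFileName is an assumed variable fed by the file loop.
        string filePath = Dts.Variables["User::CurrentFileName"].Value.ToString();

        string firstLine;
        using (var reader = new System.IO.StreamReader(filePath))
        {
            firstLine = reader.ReadLine() ?? string.Empty;
        }

        // Per the file spec above: a checksum file starts with "HEADER" on line 1.
        Dts.Variables["User::HasCheckSumRows"].Value = firstLine.StartsWith("HEADER");

        Dts.TaskResult = (int)ScriptResults.Success;
    }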
Finally, set the ConnectionString for the flat file source dynamically as you loop over the files. Hope this makes sense.