Search code examples
sql-serverimportflat-file

Import Flat File into SQL Server Repeated Headers Break Integrity Constraints


I'm having several issues with importing a flat file into MS SQL Server using the SQL Server import / export wizard. I'd like to know how to effectively load the file into a SQL Server table.

File Conditions:

  1. The flat file is fairly large (800MB, and serveral million rows)
  2. It's poorly formatted
    1. The first column is empty
    2. The header is a 3 row set: top blank, middle has field names, bottom blank
    3. This 3 row header is repeated approximately every 60,000 rows
    4. Some values are nulls
    5. It's tab delimited

First, I tried to load it in as Flat File, but SQL server failed to recognize the tab delimiters. Excel opens it correctly (although partially), but SQL Server sticks it all in 1 column.

Second, I tried opening and saving it as an excel file and loading it as an excel file into the SQL Server import wizard (which I'm not sure if it resaves all the data anyway). Now SQL Server parses the columns correctly, but it says integrity constrints are broken when it hits the repeated headers (every numeric type field has a string header every 60000 rows).

If anyone can tell me how to get around this that would be great. I'd ideally like to upload it without the integrity constraints and remove the extra headers with a DELETE WHERE header or blank clause. Not the only solution I'll take, but an idea.

Also, this is my first stackoverflow post, so patience is appreciated.

Thanks,


Solution

  • Since I don't have a formal answer yet, I'll post what I ended up doing.

    Essentially, I just made everything a varchar so it would just load into a table. Then I wrote several queries to clean up the garbage in it. Later I made new typed fields and filled them with an insert and cast from the varchar typed fields.

    I don't know that this will ever help someone, but at least there's an answer here.