Search code examples
ssisimport-from-excel

SSIS from Excel to SQL Server : DataType length


I 've got an SSIS Package (SQL Server 2008). I have an Excel source file (XLS 97-2003) that I want to import first to a SQL table storing everything as string (numbers and dates are stored as they rae written for instance). Then, I take data from this table to my other tables.

Excel source is configured like this : Provider=Microsoft.Jet.OLEDB.4.0;Data Source=*********;Extended Properties="EXCEL 8.0;HDR=YES;IMEX=1";

My problem occurs at the first step. Let me explain : some of my columns MIGHT contain large text. I know exactly what those columns are.

The problem is that :

  • On one hand, if source columns are configured to be ntext, and if there is long text (>255 char), then OK. If there is no data is these columns, or short text (<255 char), or even long text after the first 8 rows, I get an error (Red box on Excel source... does not go further).
  • On the other hand, if source columns are configured to be (wstr, 255) and if there is no data, or short data (<255 char), everything is fine. If there is large text, I get an error (which seems logical).

I would like to configure my package so that it does not fall in error if the data source contains smaller data than expected. It seems to me that it is quite reasonnable, but I cannot achieve that...

Any help will be much appreciated.


Solution

  • According to MSDN SSIS documentation, you should read these two:

    Missing values. The Excel driver reads a certain number of rows (by default, 8 rows) in the specified source to guess at the data type of each column... For more information, see PRB: Excel Values Returned as NULL Using DAO OpenRecordset.

    Truncated text. When the driver determines that an Excel column contains text data, the driver selects the data type (string or memo) based on the longest value that it samples. If the driver does not discover any values longer than 255 characters in the rows that it samples, it treats the column as a 255-character string column instead of a memo column. Therefore, values longer than 255 characters may be truncated. To import data from a memo column without truncation, you must make sure that the memo column in at least one of the sampled rows contains a value longer than 255 characters, or you must increase the number of rows sampled by the driver to include such a row. You can increase the number of rows sampled by increasing the value of TypeGuessRows under the HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Jet\4.0\Engines\Excel registry key. For more information, see PRB: Transfer of Data from Jet 4.0 OLEDB Source Fails w/ Error.

    Thus, it seems either you try to change excel source structure on the fly (which does not work with Excel provider) or you may have data that does not fit requirements listed above (i.e. no long text or long text after 8 rows). I suppose you can handle this using two possible methods:

    1. Paste dummy NTEXT-size data into those columns. Saves much nerves. You can do this for very first row, so Excel provider won't be frustrated after checking column content at all.
    2. Increase row sampling setting, using the link from MSDN. Which will anyway fail, if you may not have any text in those columns.

    PS. Third method is not to use Excel provider at all. Save Excel file as CSV and work with Flat File Source, you won't be hit by this problem working with it. Excel Source is only good when you are 100% sure that source file meets all requirements and will never accidentally change its structure.