Search code examples
c#filehelpers

How to export large SQL Server table into a CSV file using the FileHelpers library?


I'm looking to export a large SQL Server table into a CSV file using C# and the FileHelpers library. I could consider C# and bcp as well, but I thought FileHelpers would be more flexible than bcp. Speed is not a special requirement. OutOfMemoryException is thrown on the storage.ExtractRecords() when the below code is run (some less essential code has been omitted):

  SqlServerStorage storage = new SqlServerStorage(typeof(Order));
    storage.ServerName = "SqlServer"; 
    storage.DatabaseName = "SqlDataBase";
    storage.SelectSql = "select * from Orders";
    storage.FillRecordCallback = new FillRecordHandler(FillRecordOrder);
    Order[] output = null;
    output = storage.ExtractRecords() as Order[];

When the below code is run, 'Timeout expired' is thrown on the link.ExtractToFile():

 SqlServerStorage storage = new SqlServerStorage(typeof(Order));
    string sqlConnectionString = "Server=SqlServer;Database=SqlDataBase;Trusted_Connection=True";
    storage.ConnectionString = sqlConnectionString;
    storage.SelectSql = "select * from Orders";
    storage.FillRecordCallback = new FillRecordHandler(FillRecordOrder);
    FileDataLink link = new FileDataLink(storage);
    link.FileHelperEngine.HeaderText = headerLine;
    link.ExtractToFile("file.csv");

The SQL query run takes more than the default 30 sec and therefore the timeout exception. Unfortunately, I can't find in the FileHelpers docs how to set the SQL Command timeout to a higher value.

I could consider to loop an SQL select on small data sets until the whole table gets exported, but the procedure would be too complicated. Is there a straightforward method to use FileHelpers on large DB tables export?


Solution

  • FileHelpers has an async engine which is better suited for handling large files. Unfortunately, the FileDataLink class does not use it, so there's no easy way to use it with SqlStorage.

    It's not very easy to modify the SQL timeout either. The easiest way would be to copy the code for SqlServerStorage to create your own alternative storage provider and provide replacements for ExecuteAndClose() and ExecuteAndLeaveOpen() which set the timeout on the IDbCommand. (SqlServerStorage is a sealed class, so you cannot just subclass it).

    You might want to check out ReactiveETL which uses the FileHelpers async engine for handling files along with a rewrite of Ayende's RhinoETL using ReactiveExtensions to handle large datasets.