This is a commonly seen ADO.NET pattern to retrieve data from database using a data reader, but strangely doesn't work.
Doesn't work:
public static IEnumerable<IDataRecord> SelectDataRecord<T>(string query, string connString)
where T : IDbConnection, new()
{
using (var conn = new T())
{
using (var cmd = conn.CreateCommand())
{
cmd.CommandText = query;
cmd.Connection.ConnectionString = connString;
cmd.Connection.Open();
using (var reader = (DbDataReader)cmd.ExecuteReader())
{
// the main part
while (reader.Read())
{
yield return (IDataRecord)reader;
}
}
}
}
This does work:
public static IEnumerable<IDataRecord> SelectDataRecord<T>(string query, string connString)
where T : IDbConnection, new()
{
using (var conn = new T())
{
using (var cmd = conn.CreateCommand())
{
cmd.CommandText = query;
cmd.Connection.ConnectionString = connString;
cmd.Connection.Open();
using (var reader = (DbDataReader)cmd.ExecuteReader())
{
// the main part
foreach (var item in reader.Cast<IDataRecord>())
{
yield return item;
}
}
}
}
The only relevant change I see is that in the first code the iterator is returned from while
loop while in the second it is returned from a foreach
loop.
I call it like:
// I have to buffer for some reason
var result = SelectDataRecord<SQLiteConnection>(query, connString).ToList();
foreach(var item in result)
{
item.GetValue(0); // explosion
}
I tried with SQLite .NET connector as well as MySQL connector. The result is the same, i.e. first approach fails, second one succeeds.
Exception
SQLite
An unhandled exception of type 'System.InvalidOperationException' occurred in System.Data.SQLite.dll. Additional information: No current row
MySQL
An unhandled exception of type 'System.Exception' occurred in MySql.Data.dll. Additional information: No current query in data reader
Is it because of the implementation differences between reader.Read
and reader.GetEnumerator
in the specific ADO.NET connectors? I couldn't see any noticeable difference when I checked the source of System.Data.SQLite
project, GetEnumerator
calls Read
internally. I am assuming ideally in both cases the yield
keyword prevents eager execution of the method and the loops have to be executed only once the enumerator is enumerated externally.
Update:
I use this pattern to be on the safe side (essentially the same as the second approach but a little less verbose),
using (var reader = cmd.ExecuteReader())
foreach (IDataRecord record in reader as IEnumerable)
yield return record;
The difference between the two examples is because foreach
has different semantics from while
which is a plain loop. The underlying GetEnumerator
of foreach
makes the difference here.
As Joel says, in the first example, the same reader object is yielded on each iteration of the while
loop. This is because the both the IDataReader
as well as IDataRecord
are the same here, which is unfortunate. When a ToList
is called on the resulting sequence, the yielding is complete upon which the using
blocks closes the reader and connection objects, and you end up with a list of disposed reader objects of the same reference.
In the second example, the foreach
on data reader ensures a copy of IDataRecord
is yielded. The GetEnumerator
is implemented like this:
public IEnumerator GetEnumerator()
{
return new DbEnumerator(this); // the same in MySQL as well as SQLite ADO.NET connectors
}
where MoveNext
of System.Data.Common.DbEnumerator
class is implemented like:
IDataRecord _current;
public bool MoveNext() // only the essentials
{
if (!this._reader.Read())
return false;
object[] objArray = new object[_schemaInfo.Length];
this._reader.GetValues(objArray); // caching into obj array
this._current = new DataRecordInternal(_schemaInfo, objArray); // a new copy made here
return true;
}
The DataRecordInternal
is the actual implementation of the IDataRecord
which is yielded from the foreach
which is not the same reference as the reader, but a cached copy of all the values of the row/record.
The System.Linq.Cast
in this case is a mere representation preserving cast which does nothing to the overall effect. Cast<T>
will be implemented like this:
public static IEnumerable<T> Cast<T>(this IEnumerable source)
{
foreach (var item in source)
yield return (T)item; // representation preserving since IDataReader implements IDataRecord
}
An example without Cast<T>
call can be shown to not exhibit this problem.
using (var reader = cmd.ExecuteReader())
foreach (var record in reader as IEnumerable)
yield return record;
The above example just works fine.
An important distinction to make is that the first example is problematic only if you are not making use of the values read from database in its first enumeration itself. It is only the subsequent enumerations which throw since reader will be disposed by then. An eg,
using (var reader = cmd.ExecuteReader())
while (reader.Read())
yield return reader;
...
foreach(var item in ReaderMethod())
{
item.GetValue(0); // runs fine
}
...
foreach(var item in ReaderMethod().ToList())
{
item.GetValue(0); // explosion
}