c# linq chaining enumerable

Enumerable Chaining and Reset


I'm trying to import a file into a database and learn a more efficient way of doing things along the way. This article suggested that chaining enumerations yields low memory usage and good performance. This is my first time chaining several enumerations, and I'm not quite sure how to handle a reset appropriately...

Short story: I have an enumeration that reads from a text file and maps each line to a DTO (see the Map function), followed by a Where filter and an Import method that takes the resulting enumeration. It all works perfectly, except when the filter returns 0 records. In that case, the call fails with System.ArgumentException: There are no records in the SqlDataRecord enumeration....

So I put an if(!list.Any()) return; at the top of my Import method, and the error goes away. Except now it always skips all the rows up to (and including) the first valid row in the text file...

Do I need to somehow Reset() the enumerator after my Any() call? Why is this not necessary when the same construct is used in LINQ and other Enumerable implementations?

Code:

    public IEnumerable<DTO> Map(TextReader sr)
    {
        string line;
        while (null != (line = sr.ReadLine()))
        {
            var dto = new DTO();
            var row = line.Split('\t');
            // ... mapping logic goes here ...

            yield return dto;
        }
    }

    //Filter, called elsewhere 
    list.Where(x => Valid(x)).Select(x => LoadSqlRecord(x))

    public void Import(IEnumerable<SqlDataRecord> list)
    {
        using (var conn = new SqlConnection(...))
        {
            if (conn.State != ConnectionState.Open)
                conn.Open();

            var cmd = conn.CreateCommand();
            cmd.CommandText = "Import_Data";
            cmd.CommandType = System.Data.CommandType.StoredProcedure;

            var parm = new SqlParameter();
            cmd.Parameters.Add(parm);
            parm.ParameterName = "Data";
            parm.TypeName = "udt_DTO";
            parm.SqlDbType = SqlDbType.Structured;
            parm.Value = list;

            cmd.ExecuteNonQuery();
            conn.Close();
        }
    }

Sorry for the long example. Thanks for reading...


Solution

  • The issue you are seeing is likely caused not by the IEnumerable/IEnumerator interfaces, but by the underlying resources you are using to produce values.

    For most enumerables, adding a list.Any() would not cause future enumerations of list to skip items, because each call to list.GetEnumerator returns an independent object. You may have other reasons to avoid creating multiple enumerators, such as triggering multiple calls to the database via LINQ to SQL, but every enumerator will get all the items.

    In this case, however, creating multiple enumerators does not work because of the underlying resources. Based on the code you posted, I assume that the parameter passed to Import is derived from a call to the Map method. Each time you enumerate the enumerable returned from Map, you "start" at the top of the method, but the TextReader and its current position are shared between all enumerators. Any() advances the reader up to and including the first valid row, so the next enumeration continues from there. Even if you did try to call Reset on the IEnumerators, this would not reset the TextReader (and iterators built with yield return throw NotSupportedException from Reset anyway). To solve your problem, you either need to buffer the enumerable (e.g. ToList) or find a way to reset the TextReader.
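
    The behaviour and the two fixes described above can be reproduced with a small sketch. Everything here is a stand-in rather than your real code: Map yields raw lines instead of DTOs, Valid is a hypothetical rule that rejects lines starting with '#', and MapFresh is an assumed helper that "resets" the reader by opening a new one for every enumeration.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

public static class SharedReaderDemo
{
    // Stand-in for the question's Map: lazily yields one item per line.
    public static IEnumerable<string> Map(TextReader reader)
    {
        string line;
        while ((line = reader.ReadLine()) != null)
        {
            yield return line;
        }
    }

    // Hypothetical variant: opens a fresh reader for every enumeration,
    // so no read position is shared between enumerators.
    public static IEnumerable<string> MapFresh(Func<TextReader> open)
    {
        using (var reader = open())
        {
            string line;
            while ((line = reader.ReadLine()) != null)
            {
                yield return line;
            }
        }
    }

    // Hypothetical validity rule: lines starting with '#' are invalid.
    public static bool Valid(string row)
    {
        return !row.StartsWith("#", StringComparison.Ordinal);
    }

    public static void Main()
    {
        const string text = "#skip\nrow1\nrow2\nrow3";

        // Problem: Any() reads up to and including the first valid row,
        // and the next enumeration continues from the reader's position.
        var shared = Map(new StringReader(text)).Where(Valid);
        if (shared.Any())
        {
            Console.WriteLine(string.Join(",", shared)); // row2,row3
        }

        // Option 1: buffer once; the list can be enumerated repeatedly.
        var buffered = Map(new StringReader(text)).Where(Valid).ToList();
        if (buffered.Any())
        {
            Console.WriteLine(string.Join(",", buffered)); // row1,row2,row3
        }

        // Option 2: give each enumeration its own reader.
        var fresh = MapFresh(() => new StringReader(text)).Where(Valid);
        if (fresh.Any())
        {
            Console.WriteLine(string.Join(",", fresh)); // row1,row2,row3
        }
    }
}
```

    Buffering with ToList holds every valid row in memory, which gives up the low memory usage that chaining was meant to provide. Opening a fresh reader keeps the pipeline streaming, at the cost of re-reading the file from the start up to the first valid row.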