The result of a query returns the same value for COLUMN_NAME from an Entity Framework query, which doesn't make sense

I've been struggling with this problem for days. A colleague wrote a LINQ query against a SQL Server view that he also created, which returns 18 rows for the table name he's filtering on.

The view he created uses the system view INFORMATION_SCHEMA. The weird thing is that when the LINQ query returns the result, all the values for the COLUMN_NAME are ID.

However, when I run a SELECT against the same view, all the COLUMN_NAMES are different. I do not understand why there's a discrepancy.

Here's the SQL Server view:

CREATE view [Core].[vwDataDictionary] 
AS
    SELECT 
        ROW_NUMBER() OVER(order by A.[TABLE_CATALOG]) AS [ID]   
        ,A.[TABLE_CATALOG]
        ,A.[TABLE_SCHEMA]
        ,A.[TABLE_NAME]
        ,A.[COLUMN_NAME]
        ,[ORDINAL_POSITION]
        ,[COLUMN_DEFAULT]
        ,[IS_NULLABLE]
        ,[DATA_TYPE]
        ,[CHARACTER_MAXIMUM_LENGTH]
        ,[CHARACTER_OCTET_LENGTH]
        ,[NUMERIC_PRECISION]
        ,[NUMERIC_PRECISION_RADIX]
        ,[NUMERIC_SCALE]
        ,[DATETIME_PRECISION]
        ,[CHARACTER_SET_CATALOG]
        ,[CHARACTER_SET_SCHEMA]
        ,[CHARACTER_SET_NAME]
        ,[COLLATION_CATALOG]
        ,[COLLATION_SCHEMA]
        ,[COLLATION_NAME]
        ,[DOMAIN_CATALOG]
        ,[DOMAIN_SCHEMA]
        ,[DOMAIN_NAME]
        ,B.[CONSTRAINT_NAME]
        ,C.[CONSTRAINT_TYPE]
    FROM 
        [INFORMATION_SCHEMA].[COLUMNS] AS A
    LEFT JOIN 
        [INFORMATION_SCHEMA].[CONSTRAINT_COLUMN_USAGE] AS B ON A.TABLE_NAME = B.TABLE_NAME 
                                                            AND A.COLUMN_NAME = B.COLUMN_NAME
    LEFT JOIN
        [INFORMATION_SCHEMA].[TABLE_CONSTRAINTS] AS C ON B.CONSTRAINT_NAME = C.CONSTRAINT_NAME
    WHERE 
        A.[TABLE_SCHEMA] = 'App'

He tends to declare a global DbContext to use. I thought that was the problem, so I changed the code only slightly to use a local DbContext. Unfortunately, it didn't help as the changes I made are identical to what his code made.

With only that difference, the rest of this C# code is his:

using (var ctx = new CoreFrameworkEntities())
{
    var dd_fields_query = ctx.vwDataDictionaries.Where(d => d.TABLE_NAME == CurrentTableName);
    var dd_fields_query_list = dd_fields_query.ToList();

    foreach (var item in dd_fields_query_list)
    {
        string col_name = "";

        foreach (var camel in SplitCamelCase(item.COLUMN_NAME))
        {
            col_name = col_name + camel + " ";
        }

        col_name = col_name.Replace("_", "").Trim();

        if (item.COLUMN_NAME == "ID")
        {
            col_name = "ID";
        }
        else if (item.COLUMN_NAME == "InstrumentID")
        {
            col_name = "Instrument ID";
        }
        else if (item.COLUMN_NAME.EndsWith("ID"))
        {
            col_name = item.COLUMN_NAME.Split('I')[0] + " ID";
        }

        KeyValuePair<vwDataDictionary, string> kvp = new KeyValuePair<vwDataDictionary, string>(item, col_name);
        TableFields.Add(kvp);
    }
}

And here's the SELECT I wrote, to compare the results against what the LINQ query is yielding:

SELECT * 
FROM Core.vwDataDictionary
WHERE TABLE_NAME = 'Proficiency'

So, why is it that the LINQ query returns 18 rows, all with a value of "ID", whereas the SELECT returns 18 rows all with the correct column names for the Proficiency table?

Solution

There is a subtle problem with views when used from Entity Framework.

If you have a table with EF, you need to have a primary key to uniquely identify each row. Typically, that's a single column, e.g. an ID or something like that.

With a view, you don't have the concept of a "primary key" - the view just contains some columns from some tables.

So when EF maps a view, it cannot find a primary key - and therefore, it will use all non-nullable columns from the view as "substitute" primary key.

When EF now reads the data, it will get all the columns and create an in-memory object representing that row. If EF now later on reads another row from the database where those non-nullable columns that make up the substitute PK of your view are the same - then it'll think: "gee, I already have that row" and just add another copy of the same object into your result set.

So in this case, in the end you might end up having 18 identical rows in your EF result set - even though the SQL Server output properly shows different data. ......

UPDATE: as a possible solution, you could try to tap in the sys.columns and sys.tables catalog views which offer "better" columns - non-nullable ones, that aren't all the same for each column....

Try something like this:

CREATE VIEW [Core].[vwDataDictionary] 
AS
    SELECT 
        t.Name,
        t.object_id,
        c.Name,
        c.column_id
        -- possibly later more columns here....
    FROM
        sys.tables t
    INNER JOIN
        sys.columns c ON c.object_id = t.object_id