Search code examples
c#sql-serveralgorithmdata-analysis

Optimal way to analyze user qualifications data (C# and SQL server)


I need to analyze data from SQL server table. Table contains data connected with qualifications of all employyes in the company and has the following structure (simplified):

| User    | Qualification | DateOfQualificationAssignment |

| user000 | Junior        | 2014-01-15                    |

| user000 | Middle        | 2014-02-15                    |

| user001 | Middle        | 2014-02-02                    |

| user001 | Senior        | 2014-03-18                    |

| user002 | Senior        | 2014-02-19                    |

| user003 | Junior        | 2014-03-04                    |

I need the way to determine number of employees having given qualification for the concrete date. It should be some sort of analyze("Qualification", "Date") function returning the folowing for these types of input data:

  • analyze("Junior", '2014-01-20') - returns 1 (it is user user000)
  • analyze("Junior", '2014-02-20') - returns 0 (because user000 became Middle on 2014-02-15)
  • analyze("Middle", '2014-02-25') - returns 2 (because user000 and user001 are having Middle qualification on 2014-02-25)
  • analyze("Middle", '2014-03-28') - returns 1 (user000 is still Middle, but user001 became Senior on 2014-03-18)

Currently I have no idea how to handle this efficiently. What approach can be used to achieve my goal?


Solution

  • Think this should satisfy your requirements:

    create function dbo.analyze(@qualification varchar(50), @date date)
    returns int
    as
    begin
       declare @result int;
       with cte
       as
       (
           select t.*, rank() over (partition by t.[User]  order by t.DateOfQualificationAssignment desc) r
           from theTable t -- no clue what the real table is named
           where t.DateOfQualificationAssignment < @date       
       )
       select @result = count(*) 
       from cte
       where cte.r = 1 and cte.Qualification = @qualification
       return @result;
    end
    go
    

    Tested with your data:

    create table theTable
    (
        [User] varchar(50) not null,
        Qualification varchar(50) not null,
        DateOfQualificationAssignment date not null
    )
    go
    
    insert into theTable([User],Qualification,DateOfQualificationAssignment)
    values
    ('user000','Junior','20140115'),
    ('user000','Middle','20140215'),
    ('user001','Middle','20140202'),
    ('user001','Senior','20140318'),
    ('user002','Senior','20140219'),
    ('user003','Junior','20140304')
    go
    

    and the results:

    select dbo.analyze('Junior','20140120') --returns 1
    go
    
    select dbo.analyze('Junior','20140220') --returns 0
    go
    
    select dbo.analyze('Middle','20140225') --returns 2
    go
    
    select dbo.analyze('Middle','20140328') --returns 1
    go