sql-server, group-by, distinct, grouping, having

Problem with SQL select grouping


I have a small problem with a SQL Server query.

I have a view over several base tables that contains duplicate values. So far that is not a problem, since these duplicates are logical. Unfortunately, I do not get the desired end result. I could handle it in the front end of my application, but I would prefer to do the work on the server.

I will explain the principle: I have 30 companies, each of which has an employee table. My view is a union of the 30 employee tables. Each employee has a unique serial number that is the same across tables, so an employee named "John Doe" with serial number 'S0000021' can be hired in Company A and then transferred to Company Q without any problems; he will retain the serial number 'S0000021'.

In this example, the rows from Employee tables A and Q differ as follows: Company A has both the start (hire) and release (transfer) dates entered, while Company Q has only the start date. The view will therefore have 2 lines for "John Doe".

The common fields are the following:

  • Serial Number (Identical in every employee table)
  • Social Security Number (Same in every employee table)
  • Start/Hire Date
  • Release/Transfer date (empty/null if the employee is current)
  • Name (Can change across companies if the person divorces)
  • First name
  • Maiden name
  • Last Name
  • Gender
  • Final Released
  • Company Code
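To make the setup concrete, here is a minimal sketch of what the body of such a view could look like. The table variables @employee_A and @employee_Q are hypothetical stand-ins for two of the 30 company tables, the column set is reduced, and tagging each row with a literal company code is an assumption (the code may just as well be stored in each table).

```sql
-- Hypothetical stand-ins for two of the 30 per-company employee tables.
declare @employee_A table (
    matricule   varchar(20),
    numsecu     varchar(20),
    name        varchar(20),
    start       datetime,
    released    datetime);

declare @employee_Q table (
    matricule   varchar(20),
    numsecu     varchar(20),
    name        varchar(20),
    start       datetime,
    released    datetime);

-- yyyymmdd literals are parsed the same way under any language setting
insert @employee_A values ('S0000021', '123456789', 'John', '20150101', '20151231');
insert @employee_Q values ('S0000021', '123456789', 'John', '20160101', null);

-- Possible body of the view: UNION ALL keeps every row and
-- tags it with the company it came from (assumed literal codes).
select matricule, numsecu, name, start, released, 'A' as companycode
from   @employee_A
union all
select matricule, numsecu, name, start, released, 'Q' as companycode
from   @employee_Q
order by matricule, start;
```

With this sample data the query returns two rows for 'S0000021', one per company, which is the situation described above.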

The problem seems simple: I want to display only the latest information for each employee. With a GROUP BY, however, an employee whose name or release date has changed is displayed twice.

I tried the following two approaches, but they don't return what I want.

Both return results, but I always see duplicates, because the dates are never identical across companies and the name may change.

Sorry for this Google translation.

1 --

    select
        vue.matricule,
        vue.numsecu,
        vue.name,
        vue.lastname,
        vue.maidenname,
        vue.secondname,
        vue.genre,
        vue.released,
        vue.companycode

    from
        vue

    group by 
        vue.matricule,
        vue.numsecu,
        vue.name,
        vue.lastname,
        vue.maidenname,
        vue.secondname,
        vue.genre,
        vue.released,
        vue.companycode

2 --

    select 
        distinct(vue.matricule),
        vue.numsecu,
        vue.name,
        vue.lastname,
        vue.maidenname,
        vue.secondname,
        vue.genre,
        vue.released,
        vue.companycode

    from
        vue
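Both attempts keep the duplicates for the same reason: DISTINCT is not a function applied to a single column, and a GROUP BY over every selected column behaves like DISTINCT, so whole rows are compared. A minimal sketch, assuming a hypothetical table variable @demo with a reduced column set:

```sql
-- @demo is a hypothetical, reduced stand-in for the view.
declare @demo table (
    matricule   varchar(20),
    name        varchar(20),
    released    datetime,
    companycode varchar(20));

insert @demo values
    ('S0000021', 'John Doe', '20151231', 'A'),
    ('S0000021', 'John Doe', null,       'Q');

-- The parentheses only wrap the column expression;
-- DISTINCT still compares all four columns, so both rows survive.
select distinct (matricule), name, released, companycode
from   @demo;

-- Only when matricule is the sole selected column does one row remain.
select distinct matricule
from   @demo;
```

The first query returns 2 rows because released and companycode differ; the second returns 1 row.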

Solution

I assumed the following:

  • there is a view (vue) that already gathers all data from each of the 30 companies
  • you are just looking for the latest record for each employee

If you need to also see a record for each name change, we can change this.

    --set up test data
    declare @vue table (
        matricule       varchar(20),
        numsecu         varchar(20),
        name            varchar(20),
        lastname        varchar(20),
        maidenname      varchar(20),
        secondname      varchar(20),
        genre           varchar(20),
        start           datetime,
        released        datetime,
        companycode     varchar(20));
    
    insert @vue values
        ('S0000021','123456789','John', 'Doe',null,null,'M','2015-01-01','2015-12-31','A'),
        ('S0000021','123456789','Johnny', 'Doe',null,null,'M','2016-01-01',null,'Q'),       --new company, name change, currently employed
        ('S0000022','123456780','Jane', 'Doe',null,null,'M','2015-01-01','2015-12-31','A'),
        ('S0000022','123456780','Jane', 'Doe',null,null,'M','2016-01-01','2016-02-01','Q'); --new company, same name, terminated
    
    select * from @vue order by matricule, start;
    
    --get latest record for each employee
    select  *
    from    (--add row numbering 
            select  *, row_number() over (partition by matricule order by start desc) row_num
            from    @vue
            ) vue2
    where   vue2.row_num = 1;
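
If a record for each name change is also needed, one possible variation (a sketch, not necessarily what the answer's author had in mind) is to add the name columns to the partition, so that the latest row is kept per employee and per name. The reduced column set and the companycode tie-breaker for identical start dates are assumptions added for illustration.

```sql
-- Reduced, hypothetical copy of the test data.
declare @vue table (
    matricule   varchar(20),
    name        varchar(20),
    lastname    varchar(20),
    start       datetime,
    released    datetime,
    companycode varchar(20));

insert @vue values
    ('S0000021', 'John',   'Doe', '20150101', '20151231', 'A'),
    ('S0000021', 'Johnny', 'Doe', '20160101', null,       'Q'),
    ('S0000022', 'Jane',   'Doe', '20150101', '20151231', 'A'),
    ('S0000022', 'Jane',   'Doe', '20160101', '20160201', 'Q');

-- Latest row per employee AND per name.
select  matricule, name, lastname, start, released, companycode
from    (
        select  *,
                row_number() over (
                    partition by matricule, name, lastname
                    -- companycode only breaks ties between equal start dates
                    order by start desc, companycode desc) as row_num
        from    @vue
        ) as vue2
where   vue2.row_num = 1
order by matricule, start;
```

With this data the query returns three rows: 'John' (company A) and 'Johnny' (company Q) for 'S0000021', and only the 2016 row for 'S0000022', whose name did not change.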