Search code examples
sqlsql-serverdatesplitsqldatetime

Split column, and then convert different date types to standard date format (YYYY-MM-DD) - SQL Server


I have requirement to split column in 2 separate columns and then convert multiple date types column to standard date format column.

CREATE TABLE Report
(Id INT, Reference VARCHAR(30));

INSERT INTO Report
VALUES
(1, 'Location (11/8/22)'),
(2, 'Timesheet (11/10/22)'),
(3, 'TESTING (12/09/22)'),
(4, 'Incorrect Payment (9/10/22)'),
(5, 'Employee (11/9/22)'),
(6, 'Authorization'),
(7, 'Inactive Client'),
(8, 'Active Client (07/22/2022)'),
(9, 'TESTING (09/09/22)'),
(10, 'Timesheet')

SELECT * FROM Report
Id Reference
1 Location (11/8/22)
2 Timesheet (11/10/22)
3 TESTING (12/09/22)
4 Incorrect Payment (9/10/22)
5 Employee (11/9/22)
6 Authorization
7 Inactive Client
8 Active Client (07/22/2022)
9 TESTING (09/09/22)
10 Timesheet

The output I need:

Id Reference Type Date
1 Location (11/8/22) Location 2022-11-08
2 Timesheet (11/10/22) Timesheet 2022-11-10
3 TESTING (12/09/22) TESTING 2022-12-09
4 Incorrect Payment (9/10/22) Incorrect Payment 2022-09-10
5 Employee (11/9/22) Employee 2022-11-09
6 Authorization Authorization NULL
7 Inactive Client Inactive Client NULL
8 Active Client (07/22/2022) Active Client 2022-07-22
9 TESTING (09/09/22) TESTING 2022-09-09
10 Timesheet Timesheet NULL

I was able to split the Reference column, however can't find the way to remove "(" and ")" and convert it to Standard Date format. Is it even possible to convert the way I need, because date information is from manual user input, therefore 2022-09-08 can be present in a multiple ways, such as: 09/08/2022, 09/08/22, 09/8/22, 9/8/2022 and any other possible ways.

SELECT 

   p.*

   ,SUBSTRING(p.Reference, 1, CASE CHARINDEX('(', p.Reference)
        WHEN 0
            THEN LEN(p.Reference)
        ELSE CHARINDEX('(', p.Reference) - 1
        END) AS Type

   ,SUBSTRING(p.Reference, CASE CHARINDEX('(', p.Reference)
        WHEN 0
            THEN LEN(p.Reference) + 1
        ELSE CHARINDEX('(', p.Reference) + 0
        END, 1000) AS Date

FROM Report AS p
Id Reference Type Date
1 Location (11/8/22) Location (11/8/22)
2 Timesheet (11/10/22) Timesheet (11/10/22)
3 TESTING (12/09/22) TESTING (12/09/22)
4 Incorrect Payment (9/10/22) Incorrect Payment (9/10/22)
5 Employee (11/9/22) Employee (11/9/22)
6 Authorization Authorization
7 Inactive Client Inactive Client
8 Active Client (07/22/2022) Active Client (07/22/2022)
9 TESTING (09/09/22) TESTING (09/09/22)
10 Timesheet Timesheet

Solution

  • SELECT Id, Reference, 
           MAX(CAST(CASE CHARINDEX(')', value) 
                       WHEN 0 THEN  NULL
                       ELSE REPLACE(value, ')', '')
                    END AS DATE)) AS TheDate
    FROM   Report
           OUTER APPLY STRING_SPLIT(Reference, '(')
    GROUP  BY Id, Reference;
    

    As you can see your table violate the first normal form (1FN) by having non atomic values into a column... Then this is not a relational database but something close to CoBOL files in the old times...

    Then querying is difficult and performances will be poor. You better have to normalize your DB !