Search code examples
sqlsql-servert-sqlsubstringcharindex

How to extract part of a string numerous times between delimiters and input into new columns T-SQL


I want to extract part of a string numerous times from a value which has a number of the same delimiters and place the extracted strings into new columns.

Here is an example of the data I am working with (these folder paths could be even longer depending on the depth of the folder):

FolderPath:

Q:\downloads\extraction\bee\honey\
Q:\desktop\chocolate\london\time\spring\
Q:\documents\cars\astonmartin\vanquish\

I want to extract the whole string between delimiters from the left and place them into new columns at different stages of the string e.g.

FolderPath RootDirectory SubDirectory SubSubDirectory
Q:\documents\cars\astonmartin\vanquish\ Q:\documents Q:\documents\cars Q:\documents\cars\astonmartin

I'm just wondering if this is possible, especially when folders names have all types of different lengths.

Any help would me much appreciated - thanks!


Solution

  • UPDATED (with a 2nd solution)

    Solution #1 - Charindex + cascading APPLY:

    --==== Sample Data
    DECLARE @folders TABLE (Folder VARCHAR(1000));
    INSERT  @folders VALUES
    ('Q:\downloads\extraction\bee\honey\'),
    ('Q:\desktop\chocolate\london\time\spring\'),
    ('Q:\documents\cars\astonmartin\vanquish\');
    
    SELECT      FolderParse.*
    FROM        @folders AS f
    CROSS APPLY
    (
      SELECT
        FolderPath      = f.Folder,
        RootDirectory   = SUBSTRING(f.Folder, 1, p2.Pos-2),
        SubDirectory    = SUBSTRING(f.Folder, 1, p3.Pos-1),
        SubSubDirectory = SUBSTRING(f.Folder, 1, p4.Pos-1)
      FROM        (VALUES(CHARINDEX('\',f.Folder)))          AS p1(Pos)
      CROSS APPLY (VALUES(CHARINDEX('\',f.Folder,p1.Pos+1))) AS p2(Pos)
      CROSS APPLY (VALUES(CHARINDEX('\',f.Folder,p2.Pos+1))) AS p3(Pos)
      CROSS APPLY (VALUES(CHARINDEX('\',f.Folder,p3.Pos+1))) AS p4(Pos)
    ) AS folderParse;
    

    Results:

    FolderPath                                RootDirectory  SubDirectory            SubSubDirectory
    ----------------------------------------- -------------- ----------------------- ---------------------------------
    Q:\downloads\extraction\bee\honey\        Q:\downloads   Q:\downloads\extraction Q:\downloads\extraction\bee
    Q:\desktop\chocolate\london\time\spring\  Q:\desktop     Q:\desktop\chocolate    Q:\desktop\chocolate\london
    Q:\documents\cars\astonmartin\vanquish\   Q:\documents   Q:\documents\cars       Q:\documents\cars\astonmartin
    

    Solution #2 - Tally Table:

    The first solution is the way to go but this second one is better for when you need to go much deeper. For this you need to grab a copy of fnTally.

    Here's how to do it against one string:

    DECLARE @string VARCHAR(1000) =  'Q:\documents\cars\astonmartin\vanquish\';
    
    SELECT
      FolderPath      = @string,
      RootDirectory   = MAX(CASE f.RN WHEN 1 THEN f.FPath END),
      SubDirectory    = MAX(CASE f.RN WHEN 2 THEN f.FPath END),
      SubSubDirectory = MAX(CASE f.RN WHEN 3 THEN f.FPath END)
    FROM
    (
      SELECT      ROW_NUMBER() OVER (ORDER BY t.N), SUBSTRING(@string,1,t.N+2)
      FROM        (VALUES(SUBSTRING(@string,4,1000))) AS s(Txt)
      CROSS APPLY dbo.fnTally(1,LEN(s.Txt))           AS t
      WHERE       SUBSTRING(s.Txt,t.N,1) = '\'
    ) AS f(RN,FPath);
    

    Returns:

    FolderPath                               RootDirectory      SubDirectory          SubSubDirectory
    ---------------------------------------- ------------------ --------------------- --------------------------------
    Q:\documents\cars\astonmartin\vanquish\  Q:\documents       Q:\documents\cars     Q:\documents\cars\astonmartin
    

    Against a table:

    --==== Sample Data
    DECLARE @folders TABLE (Folder VARCHAR(1000));
    INSERT  @folders VALUES
    ('Q:\downloads\extraction\bee\honey\'),
    ('Q:\desktop\chocolate\london\time\spring\'),
    ('Q:\documents\cars\astonmartin\vanquish\');
    
    --==== Solution
    SELECT DirectoryParse.* 
    FROM   @folders AS fld
    CROSS APPLY
    (
      SELECT
        FolderPath      = fld.Folder,
        RootDirectory   = MAX(CASE f.RN WHEN 1 THEN f.FPath END),
        SubDirectory    = MAX(CASE f.RN WHEN 2 THEN f.FPath END),
        SubSubDirectory = MAX(CASE f.RN WHEN 3 THEN f.FPath END)
      FROM
      (
        SELECT      ROW_NUMBER() OVER (ORDER BY t.N), SUBSTRING(fld.Folder,1,t.N+2)
        FROM        (VALUES(SUBSTRING(fld.Folder,4,1000))) AS s(Txt)
        CROSS APPLY dbo.fnTally(1,LEN(s.Txt))           AS t
        WHERE       SUBSTRING(s.Txt,t.N,1) = '\'
      ) AS f(RN,FPath)
    ) AS DirectoryParse;
    

    Results:

    FolderPath                                RootDirectory  SubDirectory            SubSubDirectory
    ----------------------------------------- -------------- ----------------------- ---------------------------------
    Q:\downloads\extraction\bee\honey\        Q:\downloads   Q:\downloads\extraction Q:\downloads\extraction\bee
    Q:\desktop\chocolate\london\time\spring\  Q:\desktop     Q:\desktop\chocolate    Q:\desktop\chocolate\london
    Q:\documents\cars\astonmartin\vanquish\   Q:\documents   Q:\documents\cars       Q:\documents\cars\astonmartin