Search code examples
sql-serverparsingazure-data-factorydelimited

Parsing string with multiple delimiters into columns


I want to split strings into columns.

My columns should be:

account_id, resource_type, resource_name

I have a JSON file source that I have been trying to parse via ADF data flow. That hasn't worked for me, hence I flattened the data and brought it into SQL Server (I am open to parsing values via ADF or SQL if anyone can show me how). Please check the JSON file at the bottom.

Use this code to query the data I am working with.

 CREATE TABLE test.test2
 (
     resource_type nvarchar(max) NULL
 )

 INSERT INTO test.test2 ([resource_type]) 
 VALUES 
     ('account_id:224526257458,resource_type:buckets,resource_name:camp-stage-artifactory'),
     ('account_id:535533456241,resource_type:buckets,resource_name:tni-prod-diva-backups'),
     ('account_id:369798452057,resource_type:buckets,resource_name:369798452057-s3-manifests'),
     ('account_id:460085747812,resource_type:buckets,resource_name:vessel-incident-report-nonprod-accesslogs')

The output that I should be able to query in SQL Server should like this:

account_id resource_type resource_name
224526257458 buckets camp-stage-artifactory
535533456241 buckets tni-prod-diva-backups

and so forth.

Please help me out and ask for clarification if needed. Thanks in advance.

EDIT:

Source JSON Format:

{
    "start_date": "2021-12-01 00:00:00+00:00",
    "end_date": "2021-12-31 23:59:59+00:00",
    "resource_type": "all",
    "records": [
        {
            "directconnect_connections": [
                "account_id:227148359287,resource_type:directconnect_connections,resource_name:'dxcon-fh40evn5'",
                "account_id:401311080156,resource_type:directconnect_connections,resource_name:'dxcon-ffxgf6kh'",
                "account_id:401311080156,resource_type:directconnect_connections,resource_name:'dxcon-fg5j5v6o'",
                "account_id:227148359287,resource_type:directconnect_connections,resource_name:'dxcon-fgvfo1ej'"
            ]
        },
        {
            "virtual_interfaces": [
                "account_id:227148359287,resource_type:virtual_interfaces,resource_name:'dxvif-fgvj25vt'",
                "account_id:227148359287,resource_type:virtual_interfaces,resource_name:'dxvif-fgbw5gs0'",
                "account_id:401311080156,resource_type:virtual_interfaces,resource_name:'dxvif-ffnosohr'",
                "account_id:227148359287,resource_type:virtual_interfaces,resource_name:'dxvif-fg18bdhl'",
                "account_id:227148359287,resource_type:virtual_interfaces,resource_name:'dxvif-ffmf6h64'",
                "account_id:390251991779,resource_type:virtual_interfaces,resource_name:'dxvif-fgkxjhcj'",
                "account_id:227148359287,resource_type:virtual_interfaces,resource_name:'dxvif-ffp6kl3f'"
            ]
        }
    ]
}

Solution

  • Since you don't have a valid JSON string and not wanting to get in the business of string manipulation... perhaps this will help.

    Select B.*
     From  test2 A
     Cross Apply ( Select account_id    = max(case when value like 'account_id:%'    then stuff(value,1,11,'') end )
                         ,resource_type = max(case when value like 'resource_type:%' then stuff(value,1,14,'') end )
                         ,resource_name = max(case when value like 'resource_name:%' then stuff(value,1,14,'') end )
                    from  string_split(resource_type,',') 
                 )B
    

    Results

    account_id      resource_type   resource_name
    224526257458    buckets         camp-stage-artifactory
    535533456241    buckets         tni-prod-diva-backups
    369798452057    buckets         369798452057-s3-manifests
    460085747812    buckets         vessel-incident-report-nonprod-accesslogs