Search code examples
sqlsql-servertransformationstreet-addressssms-2017

I get this error Message "Invalid length parameter passed to the LEFT or SUBSTRING function."


enter image description hereThis code should return the street address without the street number. These EU address have their street number at the end of the address. I am not sure why the error is happening.

    UPDATE STAGING_1_1_FACT_CUSTOMERS_B2B_LGP                       
        SET [StreetAddress] = SUBSTRING([Address], 1, PATINDEX('%[1-9]%', [Address])-1)             
    FROM    [dbo].[STAGING_1_1_FACT_CUSTOMERS_B2B_LGP]
    WHERE [Country Code] IN ('NL','DE','LT','AT','BE','ES','DK','IT', 'SE', 'CZ', 'SI', 'SUI', 'EE','PL','HU','LIE','FI','LV')

Solution

  • Identify rows without a number in the address:

    SELECT * FROM dbo.STAGING_1_1_FACT_CUSTOMERS_B2B_LGP -- wow
    WHERE PATINDEX('%[1-9]%', [Address]) = 0;
    

    To get the entire address when a number doesn't occur, you can use:

    SUBSTRING(Address, 1, COALESCE(NULLIF(
      PATINDEX('%[1-9]%', [Address]), 0),LEN(Address)+1)-1)
    

    Which - finding no number - will add 1 to the length so you can still subtract 1 to get the whole string. That's assuming you want the whole string in that case.

    In order to perform the update you're still going to have to prepare for garbage data that you obviously have (or you wouldn't be here) but that you didn't include in your screenshot (also don't post data as screenshots). Given this sample data:

    CREATE TABLE dbo.STAGING_1_1_FACT_CUSTOMERS_B2B_LGP
    (
      Address       nvarchar(255),
      StreetNumber  nvarchar(255),
      StreetAddress nvarchar(255)
    );
    
    INSERT dbo.STAGING_1_1_FACT_CUSTOMERS_B2B_LGP(Address) 
    VALUES(N'Gewerbegebiet 5'),
    (N'Spännigweg 1'),
    (N'Hauptstr 113A'),
    (N'Viale Francesco Redi 39'),
    (N'Garbage your code does not handle.'),
    (N'More garbage 20th promenade 225 W');
    

    You can run the following update:

    ; /* <--- ensure previous statement terminated */ 
    WITH src AS
    (
      SELECT *, FirstNumber = 
        COALESCE(NULLIF(PATINDEX('%[1-9]%', [Address]), 0),LEN(Address)+1)
      FROM dbo.STAGING_1_1_FACT_CUSTOMERS_B2B_LGP
      -- WHERE CountryCode IN ('some', 'list');
    )
    UPDATE src SET 
      StreetNumber  = SUBSTRING(Address, FirstNumber, 255),
      StreetAddress = LEFT(Address, FirstNumber-1);
    

    Output (which shows what happens to garbage):

    Address StreetNumber StreetAddress
    Gewerbegebiet 5 5 Gewerbegebiet
    Spännigweg 1 1 Spännigweg
    Hauptstr 113A 113A Hauptstr
    Viale Francesco Redi 39 39 Viale Francesco Redi
    Garbage your code does not handle. Garbage your code does not handle.
    More garbage 20th promenade 225 W 20th promenade 225 W More garbage

    Also you don't need the FROM line in the update. You're updating the same table.

    Finally, the requirement makes little sense to me.

    • Why do you want StreetAddress to be everything up to but not including the number?
    • What happens if there is a number in a street name?
    • If you're trying to clean up address data, there is very expensive software that does this and still isn't perfect, so trying to re-invent the wheel is going to lead to lots of little frustrating issues like this one.