Search code examples
azuresql-data-warehouseazure-synapse

Azure SQL Data Warehouse: max size of varchar type


I am new to Azure and I have this field in my table in Azure SQL Data Warehouse:

[AnnotationText] varchar(MAX) NULL,

Based on what I read from https://learn.microsoft.com/en-us/sql/t-sql/data-types/char-and-varchar-transact-sql?view=sql-server-2017:

  • varchar [ ( n | max ) ] Variable-size string data. Use n to define the string size in bytes and can be a value from 1 through 8,000 or use max to indicate a column constraint size up to a maximum storage of 2^31-1 bytes (2 GB).

But what I am seeing is the size of 8000 with this type of error message:

Too long string in column [-1]: Actual len = [11054]. MaxLEN=[8000]

I tried to hardcode the size but any number bigger than 8000 isn't working as written in the document.

Also I found some docs saying that HEAP or CLUSTERED INDEX type should be used in the table creation but it didn't help either

Any idea what went wrong?


Solution

  • ADF documents that (max) data types are not supported for Polybase.

    One commonly used technique is to split the file in ADF, using Polybase to bulk load the LOB data, then an alternative technique to add the LOB data later. Another technique - faster - is to split the column using ADF, and reassemble it using a view layered over the external table at the point of ingestion.

    A better approach may be to question why LOB data is required in the data warehouse. Is there an alternative approach that could be used, depending on the type of the LOB? For example, if the LOB represents a document, could it be externalised to a blob store with a link in the DW table?