I have a master table which I am ingesting data to from a couple of source tables which I am not joining.
One of the tables doesn't have one of the columns so I want to insert Nulls into this column with an 'insert into' statement. To do this, I am trying to create a @missing_column
variable.
DECLARE @missing_column string IS NULL; // This does not work.
How would you recommend doing this? I know I can write the words "NULL" as a string and insert it as that. But that would not work for a column if it decimals.
U-SQL
uses the C# type system, so you need to define types and assign values using the C# syntax. This means you will need to use a nullable type and assign the correctly typed null
values:
Because string
is a nullable type already, you can simply use:
DECLARE @missing_column string = (string)null;
Though if you were trying to use a non-nullable type such as int
, you would need to specify the nullable version in your variable and table column:
DECLARE @missing_column int? = (int?)null;
The official documentation on U-SQL DECLARE
statements can be found here.
Having re-read your question, for your specific example you should just be able to ignore the missing column in your INSERT
statement. This will assign the values from the source table where assigned and leave the additional, unspecified target column null
. To to this, you will obviously need to have a nullable type in the table column per what I have written above:
Master table definition:
CREATE TABLE dbo.MasterTarget
(
Col1 Int,
Col2 string, // string is already nullable
Col3 DateTime?, // DateTime is not, so need to specify DateTime? that accepts null values
INDEX clx_Col1 CLUSTERED(Col1 ASC)
)
DISTRIBUTED BY HASH(Col1);
All column data load:
@src1 =
EXTRACT Col1 Int,
Col2 string,
Col3 DateTime
FROM "/SourceFile1.csv"
USING Extractors.Csv();
INSERT INTO dbo.MasterTarget
(
Col1,
Col2,
Col3
)
SELECT Col1,
Col2,
Col3
FROM @src1;
Missing column data load:
@src2 =
EXTRACT Col1 Int,
Col2 string
FROM "/SourceFile2.csv"
USING Extractors.Csv();
INSERT INTO dbo.MasterTarget
(
Col1,
Col2
)
SELECT Col1,
Col2
FROM @src2;