i have a stored procedure in sql server:
CREATE OR ALTER PROCEDURE dbo.basic_json_normalization_full
@tableNamep sysname,
@tableName sysname,
@columns_list NVARCHAR(MAX),
@json_query NVARCHAR(MAX),
@rows_count INT OUTPUT AS BEGIN
SET NOCOUNT, XACT_ABORT ON;
DECLARE @main_tableName NVARCHAR(MAX) = 'dbo.' + @tableName;
DECLARE @raw_tableName NVARCHAR(MAX) = 'dbo.' + 'raw' + @tableNamep;
-- Truncate the main table DECLARE @truncateQuery NVARCHAR(MAX);
SET @truncateQuery = 'TRUNCATE TABLE dbo.' + @tableName;
EXECUTE sp_executesql @truncateQuery;
-- Construct the dynamic SQL
DECLARE @sql NVARCHAR(MAX);
SET @sql = N'
BEGIN TRAN;
INSERT INTO ' + @main_tableName + '
(_id, ORIGINAL_ID, ' + @columns_list + ')
SELECT j.[_id], j.[ORIGINAL_ID], ' + @columns_list + '
FROM ' + @raw_tableName + ' r
CROSS APPLY OPENJSON(r.raw_data)
WITH ( _id nvarchar(155) ''$._id'', ORIGINAL_ID NVARCHAR(150) ''$.ORIGINAL_ID'', ' + @json_query + ') AS j;
SET @rows_count = @@ROWCOUNT;
COMMIT TRAN;
';
-- Execute the dynamic SQL
EXEC sp_executesql @sql, N'@rows_count int OUTPUT', @rows_count = @rows_count OUTPUT; END; GO
this procedure is consumed in a spring batch tasklet, more specifically in a method called JsonNormalize
:
@Override
public int jsonNormalize(JdbcTemplate jdbcTemplate, String tableName, String chargement) {
return simpleJdbcCall.withProcedureName("basic_json_normalization_full")
.withSchemaName("dbo")
.declareParameters(new SqlParameter("@tableName", Types.NVARCHAR))
.declareParameters(new SqlParameter("@tableNamep", Types.NVARCHAR))
.declareParameters(new SqlParameter("@columns_list", Types.NVARCHAR))
.declareParameters(new SqlParameter("@json_query", Types.NVARCHAR))
.execute(in);
}
and this is the tasklet snippet code:
@Bean
public Step BasicNormalization(JobRepository jobRepository, PlatformTransactionManager transactionManager) {
return new StepBuilder(String.join("", BASIC_NORMALIZATION), jobRepository)
.tasklet(basicNormalizationTasklet(), transactionManager)
.throttleLimit(5)
.build();
}
is there any way to make this one faster a bit, because, till now, it takes 4 hours to treat 1.2M rows.
PS: I tried executing this in SQL Server Management Studio and it takes too long too, so is there any way to use the multi-threading or partitionning option in spring batch to make it faster .
Multi-threading won't improve the performance of the step and might even produce incorrect results as multiple threads would be executing the same procedure concurrently on the same data set.
What you can try is partition the input dataset and configure a partitioned step where each worker processes a distinct partition (ie each thread will call the procedure on a different dataset). This way, you would have different workers processing distinct datasets in parallel (not the same dataset concurrently).