I'm hoping for a fix to this that doesn't use a cursor (unless necessary).
We receive holdings submissions from our Investments quarterly. They are loaded by Reference Date, Investment, Holding and currency type making each row distinct. This data is put in a database in MSFT SQL Server. Sometimes an Investment does not submit their holdings for that quarter. In that case, we would like to pull in the latest available quarter's holdings (This could be 3 or 4 quarters ago) so there is no gap when performing trending analysis over multiple reference dates.
In addition to the gaps, we need to bring forward the data from the last submission to the MAX([Reference Date]) for the entire table. For the below, this date is 6/30/2021.
We have a date dimension table. For submissions brought forward, the [Data Availability] column should show 'Include Latest from Previous Submissions'.
Issue is shown in first DDL (fix is shown in 2nd DDL example):
Issues:
Fix:
Sample data before fix:
CREATE TABLE dbo.TestInvestments
( [Reference Date] DATE, [Investment] [nvarchar](1000) NOT NULL, [holding] [nvarchar](1000) NOT NULL, [Currency Type] [varchar](100) NULL, [Data Availability] varchar(22), [Current Value] [float] NULL, [Current Cost] [float] NULL, [Realized] [Float] NULL)
;
INSERT INTO TestInvestments
( [Reference Date], [Investment], [holding], [Currency Type], [Data Availability], [Current Value], [Current Cost], [Realized])
VALUES
('2019-12-31', 'Investment 1', 'Quality Care', 'LOCAL', 'Current Qtr Submission', 0, 0, 15757000),
('2019-12-31', 'Investment 1', 'Quality Care', 'USD', 'Current Qtr Submission', 0, 0, 15757000),
('2019-12-31', 'Investment 1', 'TransUnion', 'LOCAL', 'Current Qtr Submission', 0, 0, 631410000),
('2019-12-31', 'Investment 1', 'TransUnion', 'USD', 'Current Qtr Submission', 0, 0, 631410000),
('2020-03-31', 'Investment 1', 'Quality Care', 'LOCAL', 'Current Qtr Submission', 0, 0, 15757000),
('2020-03-31', 'Investment 1', 'Quality Care', 'USD', 'Current Qtr Submission', 0, 0, 15757000),
('2020-12-31', 'Investment 2', 'West Corp.', 'LOCAL', 'Current Qtr Submission', 63872528, 54756087, 0),
('2020-12-31', 'Investment 2', 'West Corp.', 'USD', 'Current Qtr Submission', 63872528, 54756087, 0),
--End of Addition
('2021-06-30', 'Investment 3', 'Tinkers', 'USD', 'Current Qtr Submission', 12536, 26541, 0)
;
Expected Results after fix would look like this:
CREATE TABLE dbo.TestInvestmentsFixed
( [Reference Date] DATE, [Investment] [nvarchar](1000) NOT NULL, [holding] [nvarchar](1000) NOT NULL, [Currency Type] [varchar](100) NULL, [Data Availability] varchar(100), [Current Value] [float] NULL, [Current Cost] [float] NULL, [Realized] [Float] NULL)
;
INSERT INTO TestInvestmentsFixed
( [Reference Date], [Investment], [holding], [Currency Type], [Data Availability], [Current Value], [Current Cost], [Realized])
VALUES
('2019-12-31', 'Investment 1', 'Quality Care', 'LOCAL', 'Current Qtr Submission', 0, 0, 15757000),
('2019-12-31', 'Investment 1', 'Quality Care', 'USD', 'Current Qtr Submission', 0, 0, 15757000),
('2019-12-31', 'Investment 1', 'TransUnion', 'LOCAL', 'Current Qtr Submission', 0, 0, 631410000),
('2019-12-31', 'Investment 1', 'TransUnion', 'USD', 'Current Qtr Submission', 0, 0, 631410000),
('2020-03-31', 'Investment 1', 'Quality Care', 'LOCAL', 'Current Qtr Submission', 0, 0, 15757000),
('2020-03-31', 'Investment 1', 'Quality Care', 'USD', 'Current Qtr Submission', 0, 0, 15757000),
--Data added here. TransUnion not brought forward from 12/31/19
('2020-06-30', 'Investment 1', 'Quality Care', 'LOCAL', 'Include Latest from Previous Submissions', 0, 0, 15757000),
('2020-06-30', 'Investment 1', 'Quality Care', 'USD', 'Include Latest from Previous Submissions', 0, 0, 15757000),
('2020-09-30', 'Investment 1', 'Quality Care', 'LOCAL', 'Include Latest from Previous Submissions', 0, 0, 15757000),
('2020-09-30', 'Investment 1', 'Quality Care', 'USD', 'Include Latest from Previous Submissions', 0, 0, 15757000),
('2020-12-31', 'Investment 1', 'Quality Care', 'LOCAL', 'Include Latest from Previous Submissions', 0, 0, 15757000),
('2020-12-31', 'Investment 1', 'Quality Care', 'USD', 'Include Latest from Previous Submissions', 0, 0, 15757000),
('2021-03-31', 'Investment 1', 'Quality Care', 'LOCAL', 'Include Latest from Previous Submissions', 0, 0, 15757000),
('2021-03-31', 'Investment 1', 'Quality Care', 'USD', 'Include Latest from Previous Submissions', 0, 0, 15757000),
('2021-06-30', 'Investment 1', 'Quality Care', 'LOCAL', 'Include Latest from Previous Submissions', 0, 0, 15757000),
('2021-06-30', 'Investment 1', 'Quality Care', 'USD', 'Include Latest from Previous Submissions', 0, 0, 15757000),
--End of Addition
('2020-12-31', 'Investment 2', 'West Corp.', 'LOCAL', 'Current Qtr Submission', 63872528, 54756087, 0),
('2020-12-31', 'Investment 2', 'West Corp.', 'USD', 'Current Qtr Submission', 63872528, 54756087, 0),
--This data is added
('2021-03-31', 'Investment 2', 'West Corp.', 'LOCAL', 'Include Latest from Previous Submissions', 63872528, 54756087, 0),
('2021-03-31', 'Investment 2', 'West Corp.', 'USD', 'Include Latest from Previous Submissions', 63872528, 54756087, 0),
('2021-06-30', 'Investment 2', 'West Corp.', 'LOCAL', 'Include Latest from Previous Submissions', 63872528, 54756087, 0),
('2021-06-30', 'Investment 2', 'West Corp.', 'USD', 'Include Latest from Previous Submissions', 63872528, 54756087, 0),
--End of Addition
('2021-06-30', 'Investment 3', 'Tinkers', 'USD', 'Current Qtr Submission', 12536, 26541, 0)
;
Current solution (not working/and is a cursor): I think this is too much to not be in a fiddle, but I recreated this from the solution I am currently using. This brings forward holdings it shouldn't due to bringing forward holdings like TransUnion. There appears to be some other issues with this recreated script that I'll go through:
/* ******************* USD Only in this section - below USD is the Local ******************************* */
IF OBJECT_ID('TEMPDB.dbo.#TestInvestments_USD_Only') IS NOT NULL DROP TABLE #TestInvestments_USD_Only
IF OBJECT_ID('TEMPDB.dbo.#nums') IS NOT NULL DROP TABLE #nums
IF OBJECT_ID('TEMPDB.dbo.#TestInvestments_LatestAvailable_USD') IS NOT NULL DROP TABLE #TestInvestments_LatestAvailable_USD
/*create temp table for USD Holdings only - Local Currency is unioned later*/
Select * into #TestInvestments_USD_Only from TestInvestments where [Currency Type] = 'USD'
/* Table to contain the missing holdings*/
CREATE TABLE #TestInvestments_LatestAvailable_USD(
[Reference Date] date NOT NULL
,Investment nvarchar (200) NOT NULL
,[Holding Key] int NOT NULL
,Holding nvarchar (200) NOT NULL
,[Currency Type] [varchar](100) NULL
,[Data Availability] varchar(100) NULL
,[Current Value] [float] NULL
,[Current Cost] [float] NULL
,[Realized] [float] NULL
,primary key(Investment, [Holding Key], [Reference Date])
);
--cartesian join to create a table of row numbers (65,536 rows)
create table #nums(n integer not null primary key);
with
p1 as (select 0 as n union all select 0), -- 2
p2 as (select 0 as n from p1 cross join p1 as b), -- 4
p3 as (select 0 as n from p2 cross join p2 as b), -- 16
p4 as (select 0 as n from p3 cross join p3 as b), -- 256
p5 as (select 0 as n from p4 cross join p4 as b) -- 65536
insert into #nums(n) select row_number() over(order by (select 0)) from p5;
--Create Variables for all of the USD columns to store the missing rows
declare
@prev_Investment nvarchar(200) ,@Investment nvarchar(200),
@prev_HoldingKey int, @HoldingKey int,
@prev_ReferenceDate date ,@ReferenceDate date,
@prev_Holding nvarchar(200) ,@Holding nvarchar(200),
@prev_CurrencyType varchar(100) ,@CurrencyType varchar(100),
@prev_DataAvailability varchar(100),@DataAvailability varchar(100),
@prev_CurrentValue float ,@CurrentValue float,
@prev_CurrentCost float ,@CurrentCost float,
@prev_Realized float ,@Realized float,
@qdiff integer;
--Begin loop for USD to look for missing rows based on Investment, holdingkey and reference date, and add missing data to a variables
declare
c cursor forward_only static read_only for
select Investment, [Holding Key], [Reference Date],Holding,[Currency Type],[Data Availability],[Current Value],[Current Cost],[Realized]
from #TestInvestments_USD_Only
union all
select Investment, [Holding Key], ref_date, NULL, NULL, NULL, NULL, NULL, NULL
from (select dateadd(q,1,max([reference date])) as ref_date from #TestInvestments_USD_Only) as a
cross join (select distinct Investment from #TestInvestments_USD_Only) as b
cross join (select distinct [Holding Key] from #TestInvestments_USD_Only) as c
--cross join (select distinct [Currency Type] from #TestInvestments_USD_Only) as d
order by Investment, [Holding Key], [Reference Date], [Current Value] desc;
open c;
--Set the most recent data = to the reference date we are currently looking at (if missing data, then set the latest data = current quarter in the loop)
fetch next from c into @Investment,@HoldingKey,@ReferenceDate,@Holding,@CurrencyType,@DataAvailability,@CurrentValue,@CurrentCost,@Realized;
select
@prev_Investment=@Investment,
@prev_HoldingKey=@HoldingKey,
@prev_ReferenceDate=@ReferenceDate,
@prev_Holding=@Holding,
@prev_CurrencyType=@CurrencyType,
@prev_DataAvailability=@DataAvailability,
@prev_CurrentValue=@CurrentValue,
@prev_CurrentCost=@CurrentCost,
@prev_Realized=@Realized
;
--Insert the missing data into the temp table
while @@fetch_status = 0 begin
fetch next from c into @Investment,@HoldingKey,@ReferenceDate,@Holding,@CurrencyType,@DataAvailability,@CurrentValue,@CurrentCost,@Realized;
set @qdiff = datediff(q, @prev_ReferenceDate, @ReferenceDate);
if @prev_Investment = @Investment and
@prev_HoldingKey = @HoldingKey and
@qdiff > 1
begin
insert into #TestInvestments_LatestAvailable_USD([Investment],[Holding Key],[Reference Date],[Holding],[Currency Type],[Data Availability],[Current Value],[Current Cost],Realized)
select
@prev_Investment,@prev_HoldingKey,eomonth(dateadd(q, #nums.n, @prev_ReferenceDate)),@prev_Holding,@prev_CurrencyType,@prev_DataAvailability,@prev_CurrentValue,@prev_CurrentCost,@prev_Realized
from #nums
where #nums.n < @qdiff + iif(@CurrentValue = -1, 1, 0);
end;
select
@prev_Investment=@Investment,
@prev_ReferenceDate=@ReferenceDate,
@prev_HoldingKey=@HoldingKey,
@prev_Holding=@Holding,
@prev_CurrencyType=@CurrencyType,
@prev_DataAvailability=@DataAvailability,
@prev_CurrentValue=@CurrentValue,
@prev_CurrentCost=@CurrentCost,
@prev_Realized=@Realized
END
CLOSE c
DEALLOCATE c
--//////////////////////////////////////////////////////////////////////////////////////////////////////
/* ******************* Local Currency Only in this section - below USD is the Local ******************************* */
IF OBJECT_ID('TEMPDB.dbo.#TestInvestments_Local_Only') IS NOT NULL DROP TABLE #TestInvestments_Local_Only
IF OBJECT_ID('TEMPDB.dbo.#Local_nums') IS NOT NULL DROP TABLE #Local_nums
IF OBJECT_ID('TEMPDB.dbo.#TestInvestments_LatestAvailable_Local') IS NOT NULL DROP TABLE #TestInvestments_LatestAvailable_Local
/*create temp table for USD Holdings only - Local Currency is unioned later*/
Select * into #TestInvestments_Local_Only from TestInvestments where [Currency Type] = 'Local'
--create temp table to hold the local missing holdings
CREATE TABLE #TestInvestments_LatestAvailable_Local(
[Reference Date] date NOT NULL
,Investment nvarchar (200) NOT NULL
,[Holding Key] int NOT NULL
,Holding nvarchar (200) NOT NULL
,[Currency Type] [varchar](100) NULL
,[Data Availability] varchar(100) NULL
,[Current Value] [float] NULL
,[Current Cost] [float] NULL
,[Realized] [float] NULL
,primary key(Investment, [Holding Key], [Reference Date])
);
--cartesian join to create a table of row numbers (65,536 rows)
create table #Local_nums(n integer not null primary key);
with
p1 as (select 0 as n union all select 0), -- 2
p2 as (select 0 as n from p1 cross join p1 as b), -- 4
p3 as (select 0 as n from p2 cross join p2 as b), -- 16
p4 as (select 0 as n from p3 cross join p3 as b), -- 256
p5 as (select 0 as n from p4 cross join p4 as b) -- 65536
insert into #Local_nums(n) select row_number() over(order by (select 0)) from p5;
--Create Variables for all of the USD columns to store the missing rows
declare
@local_prev_Investment nvarchar(200) ,@local_Investment nvarchar(200),
@local_prev_HoldingKey int, @local_HoldingKey int,
@local_prev_ReferenceDate date ,@local_ReferenceDate date,
@local_prev_Holding nvarchar(200) ,@local_Holding nvarchar(200),
@local_prev_CurrencyType varchar(100) ,@local_CurrencyType varchar(100),
@local_prev_DataAvailability varchar(100),@local_DataAvailability varchar(100),
@local_prev_CurrentValue float ,@local_CurrentValue float,
@local_prev_CurrentCost float ,@local_CurrentCost float,
@local_prev_Realized float ,@local_Realized float,
@local_qdiff integer;
--Begin loop for USD to look for missing rows and add them as variables
declare
d cursor forward_only static read_only for
select Investment, [Holding Key], [Reference Date],Holding,[Currency Type],[Data Availability],[Current Value],[Current Cost],[Realized]
from #TestInvestments_Local_Only
union all
select Investment, [Holding Key], ref_date, NULL, NULL, NULL, NULL, NULL, NULL
from (select dateadd(q,1,max([reference date])) as ref_date from #TestInvestments_Local_Only) as a
cross join (select distinct Investment from #TestInvestments_Local_Only) as b
cross join (select distinct [Holding Key] from #TestInvestments_Local_Only) as d
--cross join (select distinct [Currency Type] from #TestInvestments_Local_Only) as d
order by Investment, [Holding Key], [Reference Date], [Current Value] desc;
open d;
--Set the most recent data = to the reference date we are currently looking at (if missing data, then set the latest data = current quarter in the loop)
fetch next from d into @local_Investment,@local_HoldingKey,@local_ReferenceDate,@local_Holding,@local_CurrencyType,@local_DataAvailability,@local_CurrentValue,@local_CurrentCost,@local_Realized;
select
@local_prev_Investment=@local_Investment,
@local_prev_HoldingKey=@local_HoldingKey,
@local_prev_ReferenceDate=@local_ReferenceDate,
@local_prev_Holding=@local_Holding,
@local_prev_CurrencyType=@local_CurrencyType,
@local_prev_DataAvailability=@local_DataAvailability,
@local_prev_CurrentValue=@local_CurrentValue,
@local_prev_CurrentCost=@local_CurrentCost,
@local_prev_Realized=@local_Realized;
--Insert the missing data into the temp table
while @@fetch_status = 0 begin
fetch next from d into @local_Investment,@local_HoldingKey,@local_ReferenceDate,@local_Holding,@local_CurrencyType,@local_DataAvailability,@local_CurrentValue,@local_CurrentCost,@local_Realized;
set @local_qdiff = datediff(q, @local_prev_ReferenceDate, @local_ReferenceDate);
if @local_prev_Investment = @local_Investment and
@local_prev_HoldingKey = @local_HoldingKey and
@local_qdiff > 1
begin
insert into #TestInvestments_LatestAvailable_Local([Investment],[Holding Key],[Reference Date],[Holding],[Currency Type],[Data Availability],[Current Value],[Current Cost],Realized)
select
@local_prev_Investment,@local_prev_HoldingKey,eomonth(dateadd(q, #Local_nums.n, @local_prev_ReferenceDate)),@local_prev_Holding,@local_CurrencyType,@local_DataAvailability,@local_CurrentValue,@local_CurrentCost,@local_Realized
from #Local_nums
where #Local_nums.n < @local_qdiff + iif(@local_CurrentValue = -1, 1, 0);
end;
select
@local_prev_Investment=@local_Investment,
@local_prev_HoldingKey=@local_HoldingKey,
@local_prev_ReferenceDate=@local_ReferenceDate,
@local_prev_Holding=@local_Holding,
@local_prev_CurrencyType=@local_CurrencyType,
@local_prev_DataAvailability=@local_DataAvailability,
@local_prev_CurrentValue=@local_CurrentValue,
@local_prev_CurrentCost=@local_CurrentCost,
@local_prev_Realized=@local_Realized;
end
close d
deallocate d
/*Had to add an extra quarter to get the maximum reference date for the current quarter, otherwise the max date was the max date that the fund reported
Below, I delete anything that went past the current MAX reference date for ALL FUNDS (not just a specific fund)*/
DELETE FROM #TestInvestments_LatestAvailable_Local WHERE [Reference Date] >= (select DATEADD(q,1,MAX([Reference Date])) from TestInvestments)
DELETE FROM #TestInvestments_LatestAvailable_USD WHERE [Reference Date] >= (select DATEADD(q,1,MAX([Reference Date])) from TestInvestments)
/*statement to insert missing holdings into current TestInvestments table*/
Insert into TestInvestments ([Data Availability],[Investment],[Holding Key],[Holding],[Reference Date],[Currency Type],[Current Value],[Current Cost],Realized)
select [Data Availability] = 'Include Latest from Previous Submissions',[Investment],[Holding Key],[Holding],[Reference Date],[Currency Type],[Current Value],[Current Cost],Realized
from #TestInvestments_LatestAvailable_Local
union
select [Data Availability] = 'Include Latest from Previous Submissions',[Investment],[Holding Key],[Holding],[Reference Date],[Currency Type],[Current Value],[Current Cost],Realized
from #TestInvestments_LatestAvailable_USD
Thank you so much for any help you can provide!
I wound up being able to fill in gaps in data and bring forward data to present day using a set based solution to replace the cursor I had previously, that was too granular (based on missing holding, instead of missing investment and populating latest holdings based on that investment):
/* -----------First part of code only fills the gaps in the data - does not bring forward data to current date--------- */
/*Get max date to bring holdings forward*/
Drop table if exists #MaxDatefromUholdings
Select Max([Reference Date]) as MaxD
into #MaxDatefromUholdings from TestInvestmentsFixed
Drop table if exists #Quarter
select distinct LastDayOfQuarter into #Quarter
from trsinvsqlp.ctrl.dbo.datedim
where LastDayOfQuarter between '3/1/2018' and (Select MaxD from #MaxDatefromUholdings) --First date we received holdings to max date of the portfolio
Drop table if exists #MissingHolding
/*Get first date an investment was submitted so cross join doesn't populate dates from the datedim table before this*/
;with tbl_Holding as (
select holding,
investment,
[Currency Type],
min([Reference Date]) ReferenceDateMin,
max([Reference Date]) ReferenceDateMax
from TestInvestmentsFixed
group by holding, investment, [Currency Type]
)
/*Get all combinations of dates and holdings*/
select tbl_Holding.holding,
tbl_Holding.investment,
tbl_Holding.[Currency Type],
lastdayofquarter,
tbl_Holding.ReferenceDateMax,
ReferenceDateMin
into #MissingHolding
from tbl_Holding
cross join #Quarter
where 1 = 1
and #Quarter.LastDayOfQuarter >= tbl_Holding.ReferenceDateMin /*Filter cross join - Grab latest available greater than 1st time an investment submitted*/
and #Quarter.LastDayOfQuarter <= ReferenceDateMax /*LastDayofQuart less than last submission date by investment - omit data where a manager hasn't submitted yet - this is taken care of in second part of code*/
and not exists (
select 1
from TestInvestmentsFixed tbl_ActualHolding
where tbl_Holding.[Currency Type] = tbl_ActualHolding.[Currency Type]
and tbl_Holding.investment = tbl_ActualHolding.investment
--and tbl_Holding.Holdingskey = tbl_ActualHolding.Holdingskey
and #Quarter.LastDayOfQuarter = tbl_ActualHolding.[Reference Date]
) --Finds data in the cross joined table, not currently in TestInvestmentsFixed
Drop table if exists #HoldingsGapsFilled
;with tbl_MissingHolding as (
select #MissingHolding.holding as L_holding,
[L_Data Availability Missing] = 'Include Latest from Previous Submissions',
#MissingHolding.investment as L_investment,
#MissingHolding.[Currency Type] as [L_Currency Type],
#MissingHolding.LastDayOfQuarter,
#MissingHolding.ReferenceDateMin,
max(TestInvestmentsFixed.[Reference Date]) ReferenceDateLast
from #MissingHolding
join TestInvestmentsFixed
on #MissingHolding.holding=TestInvestmentsFixed.Holding
and #MissingHolding.investment = TestInvestmentsFixed.Investment
and #MissingHolding.[Currency Type] = TestInvestmentsFixed.[currency type]
and #MissingHolding.LastDayOfQuarter > TestInvestmentsFixed.[Reference Date] /*Pulls in latest within data gaps available only*/
group by #MissingHolding.holding,
#MissingHolding.investment,
#MissingHolding.[Currency Type],
#MissingHolding.LastDayOfQuarter,
#MissingHolding.ReferenceDateMin
)
/*Add rest of columns from TestInvestmentsFixed to latest available data*/
select tbl_MissingHolding.*,TestInvestmentsFixed.*
into #HoldingsGapsFilled
from tbl_MissingHolding
join TestInvestmentsFixed
on tbl_MissingHolding.L_holding=TestInvestmentsFixed.Holding
and tbl_MissingHolding.L_investment = TestInvestmentsFixed.Investment
and tbl_MissingHolding.[L_Currency Type] = TestInvestmentsFixed.[currency type]
and tbl_MissingHolding.ReferenceDateLast = TestInvestmentsFixed.[Reference Date]
/* ----------Second part of code brings forward investments from their max submission date from the last date that an investment submitted--------- */
Drop table if exists #HoldingsSinceLatestSubmission
--Get last submission date for each investment
;With AddFutureStuff as (
select distinct
investment as Distinct_Investment,
max([Reference Date]) LastSubmissionDate
from TestInvestmentsFixed
group by investment
),
--Get max date for entire portfolio
MaxDateofPortfolio as (
Select max([Reference Date]) as MaxPortfolioDate
from TestInvestmentsFixed
),
--Combine last date an investment submitted with the max date for the entire portfolio
DatesBetween as (
select Distinct_Investment, LastSubmissionDate, (select MaxPortfolioDate from MaxDateofPortfolio) as MaxPortfolioDate
from AddFutureStuff
)
select #Quarter.LastDayOfQuarter,
DatesBetween.*,
TestInvestmentsFixed.*
into #HoldingsSinceLatestSubmission
from DatesBetween
--Join TestInvestmentsFixed to pull in all other columns from TestInvestmentsFixed with last submission dates and max portfolio date
join TestInvestmentsFixed
on TestInvestmentsFixed.[Reference Date] = DatesBetween.LastSubmissionDate
and TestInvestmentsFixed.investment = DatesBetween.Distinct_Investment
--Cross join to populate table for all dates between last submission date (by investment) and max portfolio date
cross join #Quarter
where #quarter.LastDayOfQuarter > LastSubmissionDate and #quarter.LastDayOfQuarter <= MaxPortfolioDate --Ensures cross join is only between latest submission and max Portfolio date
and LastSubmissionDate <> MaxPortfolioDate --Remove any holdings submitted in the current quarter
order by TestInvestmentsFixed.Investment, #Quarter.LastDayOfQuarter