I have coded two T-SQL queries which are executed on Databricks using Databricks SQL.
The first query returns results of 144 rows:
144 Rows Returned
SELECT DISTINCT
PSTT.TransactionTypeID
FROM basedrd.TransactionType PSTT
LEFT OUTER JOIN basedrd.PMSTransaction ST
ON PSTT.TransactionType = ST.TransactionType
WHERE PSTT.TransactionSource = 'Pershing'
OR (PSTT.TransactionSource = '')
AND (ST.SEDOL = 'CASH'
AND PSTT.IsCashTransaction = 1
OR ST.SEDOL <> 'CASH'
AND PSTT.IsCashTransaction = 0)
The second query doesn't return any rows
No Rows returned
SELECT DISTINCT
PSTT.TransactionTypeID
FROM basedrd.PMSTransaction ST
LEFT OUTER JOIN basedrd.TransactionType PSTT
ON PSTT.TransactionType = ST.TransactionType
AND (PSTT.TransactionSource = 'Pershing'
OR PSTT.TransactionSource = '')
AND (ST.SEDOL = 'CASH'
AND PSTT.IsCashTransaction = 1
OR ST.SEDOL <> 'CASH'
AND PSTT.IsCashTransaction = 0)
The stranges thing is that if I execute the same code on our SQL Server I get 145 rows returned onf the first query
145 rows returned
SELECT DISTINCT
PSTT.TransactionTypeID
FROM dbo.TransactionType PSTT
LEFT OUTER JOIN dbo.PMSTransaction ST
ON PSTT.TransactionType = ST.TransactionType
WHERE PSTT.TransactionSource = 'Pershing'
OR (PSTT.TransactionSource = '')
AND (ST.SEDOL = 'CASH'
AND PSTT.IsCashTransaction = 1
OR ST.SEDOL <> 'CASH'
AND PSTT.IsCashTransaction = 0)
And whereas I wasn't getting any rows returned with the second query on Databricks above on our SQL Server I get 46 rows returned.
SELECT DISTINCT
PSTT.TransactionTypeID
FROM dbo.PMSTransaction ST
LEFT OUTER JOIN dbo.TransactionType PSTT
ON PSTT.TransactionType = ST.TransactionType
AND (PSTT.TransactionSource = 'Pershing'
OR PSTT.TransactionSource = '')
AND (ST.SEDOL = 'CASH'
AND PSTT.IsCashTransaction = 1
OR ST.SEDOL <> 'CASH'
AND PSTT.IsCashTransaction = 0)
The only difference between the query executed on Databricks SQL and SQL Server is the database name(schema) SQL Server is dbo, whereas databricks is basedrd.
It appears that your data doesn't match up. A query like should help confirm that the values don't line up as expected:
with data as (
select
case when PSTT.TransactionSource = 'Pershing' then 'Pershing'
when PSTT.TransactionSource = '' then 'Blank'
when ST.TransactionType is not null then
case when PSTT.TransactionType is not null then 'Other' else 'Null' end
else 'Unmatched Row' end as TransactionSource,
case when ST.SEDOL = 'CASH' then 'Cash'
when ST.SEDOL is null then 'Null'
else 'Other' end as SEDOL,
case when PSTT.IsCashTransaction = 0 then '0'
when PSTT.IsCashTransaction = 1 then '1'
when PSTT.IsCashTransaction is null then 'Null'
else 'Other' end as IsCashTransaction
from <SRC>.PMSTransaction ST left outer join <SRC>.TransactionType PSTT
on PSTT.TransactionType = ST.TransactionType
)
select TransactionSource, SEDOL, IsCashTransaction, count(*)
from data
group by TransactionSource, SEDOL, IsCashTransaction
order by TransactionSource, SEDOL, IsCashTransaction;