Disclaimer: not my code, not my database design!
I have a column of censusblocks(varbinary(max), null)
in a MS SQL Server 2008 db table (call it foo
for simplicity).
This column is actually a null
or 1 to n long list of int
. The int
s are actually foreign keys to another table (call it censusblock
with a pk id
of type of int
), numbering from 1 to ~9600000.
I want to query to extract the censusblocks
list from foo
, and use the extracted list of int
from each row to look up the corresponding censusblock
row. There's a long, boring rest of the query that will be used from there, but it needs to start with the census blocks pulled from the foo
table's censusblocks
column.
This conversion-and-look-up is currently handled on the middle tier, with a small .NET utility class to convert from List<int>
to byte[]
(and vice versa), which is then written into/read from the db as varbinary
. I would like to do the same thing, purely in SQL.
The desired query would go something along the lines of
SELECT f.id, c.id
FROM foo f
LEFT OUTER JOIN censusblock c ON
c.id IN f.censusblocks --this is where the magic happens
where f.id in (1,2)
Which would result in:
f.id | c.id
1 8437314
1 8438819
1 8439744
1 8441795
1 8442741
1 8444984
1 8445568
1 8445641
1 8447953
2 5860657
2 5866881
2 5866881
2 5866858
2 5862557
2 5870475
2 5868983
2 5865207
2 5863465
2 5867301
2 5864057
2 5862256
NB: the 7-digit results are coincidental. The range is, as stated above, 1-7 digits.
The actual censusblocks
column looks like
SELECT TOP 2 censusblocks FROM foo
which results in
censublocks
0x80BE4280C42380C7C080CFC380D37580DC3880DE8080DEC980E7D1
0x596D3159858159856A59749D59938B598DB7597EF7597829598725597A79597370
For further clarification, here's the guts of the .NET utility classes conversion methods:
public static List<int> getIntegersFromBytes(byte[] data)
{
List<int> values = new List<int>();
if (data != null && data.Length > 2)
{
long ids = data.Length / 3;
byte[] oneId = new byte[4];
oneId[0] = 0;
for (long i = 0; i < ids; i++)
{
oneId[0] = 0;
Array.Copy(data, i * 3, oneId, 1, 3);
if (BitConverter.IsLittleEndian)
{ Array.Reverse(oneId); }
values.Add(BitConverter.ToInt32(oneId, 0));
}}
return values;
}
public static byte[] getBytesFromIntegers(List<int> values)
{
byte[] data = null;
if (values != null && values.Count > 0)
{
data = new byte[values.Count * 3];
int count = 0;
byte[] idBytes = null;
foreach (int id in values)
{
idBytes = BitConverter.GetBytes(id);
if (BitConverter.IsLittleEndian)
{ Array.Reverse(idBytes); }
Array.Copy(idBytes, 1, data, count * 3, 3);
count++;
} }
return data;
}
An example of how this might be done. It is unlikely to scale brilliantly.
If you have a numbers table in your database it should be used in place of nums_cte
.
This works by converting the binary value to a literal hex string, then reading it in 8-character chunks
-- create test data
DECLARE @foo TABLE
(id int ,
censusblocks varbinary(max)
)
DECLARE @censusblock TABLE
(id int)
INSERT @censusblock (id)
VALUES(1),(2),(1003),(5030),(5031),(2),(6)
INSERT @foo (id,censusblocks)
VALUES (1,0x0000000100000002000003EB),
(2,0x000013A6000013A7)
--query
DECLARE @biMaxLen bigint
SELECT @biMaxLen = MAX(LEN(CONVERT(varchar(max),censusblocks,2))) FROM @foo
;with nums_cte
AS
(
SELECT TOP (@biMaxLen) ((ROW_NUMBER() OVER (ORDER BY a.type) - 1) * 8) AS n
FROM master..spt_values as a
CROSS JOIN master..spt_values as b
)
,binCTE
AS
(
SELECT d.id, CAST(CONVERT(binary(4),SUBSTRING(s,n + 1,8),2) AS int) as cblock
FROM (SELECT Id, CONVERT(varchar(max),censusblocks,2) AS s FROM @foo) AS d
JOIN nums_cte
ON n < LEN(d.s)
)
SELECT *
FROM binCTE as b
LEFT
JOIN @censusblock c
ON c.id = b.cblock
ORDER BY b.id, b.cblock
You could also consider adding your existing .Net conversion methods into the database as an assembly and accessing them through CLR functions.